PHASES IN THE DEVELOPMENT OF CHILDREN’S PAINTING 


VERA BEACH 
Baby Institute, New York City 


MARY H. BRESSLER 
Board of Education, New York City 


When children begin painting at a very 
early age, and continue it in a relatively free 
emotional atmosphere, they progress, with 
some skips forward and backward, through 
various developmental phases. The length of 
time spent at any one phase level varies some- 
what with each child. One child may spend 
two weeks at a certain level, while another 
spends two years. Transition from one phase 
to another, with elements of one or several 
phases, may be found in a single painting. 

The present writers believe that children 
paint most easily and with greatest satisfac- 
tion in later years when they have had experi- 
ence in the first four developmental phases. 
Experience in these four phases allows easy 
transition to representational work. 

Anyone who has given a child of two or 
three years a broad paintbrush, a large piece 
of unprinted newspaper, and one or two 
colors of poster paint has seen a child’s pre- 
representational painting. Teachers often miss 
seeing pre-representational work, because 
children entering school start painting after 
much pencil and crayon drawing, and even 
early drawings tend to be representational; 
that is, look like people, animals, houses, etc. 
Drawing is a very different medium from 
painting, and by the age of five and one half 
or six years considerable realism and form 
have emerged. The influence of drawing, plus 
the question, “What is it?” asked either 
directly or by implication, results in these 
children’s coming to their first painting ex- 
perience with the idea of painting something 
—something that can be named and approved. 

In nursery schools enough freedom is usu- 
ally given to children in all areas of the daily 
program to permit their individual expression 
of mood and body feeling in painting. In most 
schools for children over six years there is 
not only less emotional freedom, but there is 
the specific restraint that comes from recog- 
nition of representational painting as the 
acceptable product. Too often mastery of 
form becomes a fixed standard in the child’s 
mind, even though such a standard has never 
been put into words by the teacher. 


Thus, in many nursery schools, teachers 
have learned to accept and appreciate the 
early work in painting that is natural to chil- 
dren and necessary for their all-round growth 
and development. In many kindergartens and 
first and second grades, the value of any 
variation from the usual representational 
painting may be overlooked. 

When the teacher has a knowledge of the 
phases of continuous growth, this awareness 
in itself provides an interest and enthusiasm 
on her part that serve as a stimulus to the 
child’s development. At first the teacher may 
not be able to articulate all the aspects of 
growth that she sees taking place. 


DEVELOPMENTAL PHASES 


After a study of many paintings done by 
children in the age range of two to seven 
years, in day nurseries, private nursery 
schools, public school classrooms, and special 
“art” rooms, the writers distinguish five de- 
velopmental phases. The fifth phase, which 
includes the highest development in art, is 
rarely attained at seven years. These five 
phases are: 

1. Relatively uncoordinated scrubbing, 
sweeping, spreading color on the page.—This 
phase is exploratory. The arm swings freely. 
The child learns about the liquid spreading 
quality of the paint; about the brush, what 
it does, how the paper resists the stroke of 
his brush; what pressure tears the paper, how 
paint is absorbed; about the transfer of paint 
from jar to paper. The child’s body move- 
ments are random; paint may run and spatter. 
The result may be chaotic. 











2. Accidentally attained design.—Lines, 
dots, short strokes, or color areas are left as 
distinct areas or patterns on the paper, in- 
stead of merging imperceptibly one into the 
other. 

More coordination characterizes this phase. 
The design element is accidental—through 
outside interruption or disturbance, change of 
mood, or shift in attention. The child wants 
to leave the line, swab, swing, or color area. 
He begins to look at what is before him and 
takes pleasure in the accidentally attained 
design and the masses of color. His mood 
determines choice of color, ability to stop, 
placement on the paper. The final organiza- 
tion, what the child sees as a finished per- 
formance, molds the total integrative experi- 
ence. 

3. Consciously sought design.—This is usu- 
ally an implicit use of the whole page, even 
if the whole page is not covered. Paintings in 
this phase vary from the first crude efforts 
of young children toward design to the mod- 
ern abstractions of sophisticated adult artists. 

The painting may not now be entirely pre- 
planned, but the area of the paper is realized 
as to size and quality, and colors are chosen 
deliberately. Balance and design emerge. 
Some simple phase three paintings are striped, 
some are in large blocks of color, and some 
fill in two broad diagonals. A pleasing design 
may be repeated many times. If pictorial rep- 
resentation is present, we arbitrarily classify 
the painting as phase four, or as phases three 
and four mixed. 


4. Representation.—Paintings in this phase 
range from the first house or man of a young 
child to the primitive of adult artists. This 
phase does not use perspective. It is the one 
most frequently seen by first- and second- 
grade teachers. The first efforts of many six- 
and seven-year-old school children fall in 
phase four. It is common for such children to 
scrub in the portrayal of sky or other large 
areas, just as they do in phase one. Or they 
may start to paint a scene and lapse into pure 
design. These children need more experience 
in the earlier phases. 


5. Full realization of representation and 
design.—There is more highly developed com- 
munication of feeling and idea. If realistic 
form is used, perspective is utilized, but this 
phase, which includes the most mature art 
work, need not necessarily be realistic. 

The accompanying plates reproduce illus- 
trations of the first four phases (Figures 1-4). 





Fig. 1. Phase 1 in Children’s Painting. 
Fig. 2. Phase 2 in Children’s Painting. 
Fig. 3. Phase 3 in Children’s Painting. 
Fig. 4. Phase 4 in Children’s Painting. 


These samples from the authors’ collection 
were done by children who have had a great 
deal of opportunity to paint. Children with 
infrequent or markedly delayed experience do 
not always show clear-cut development. Their 
paintings may combine several phases, often 
because they strive too early for the fourth 
level, which they cannot sustain easily. They 
usually need to go back and work in earlier 
phases. 

At each phase the child is growing in both 
mastery of the medium and in ability to com- 
municate his dominant mood or idea. In early 
mastery we include quality of brush stroke, 
absence of dripping, and use of solid color 
areas as opposed to drawing and filling in 
outlines. 

By communication we mean the degree to 
which the painting expresses for the painter 
and to the onlooker the experience (physical, 
kinesthetic, intellectual, emotional) that has 
been set down in color and space. It repre- 
sents an integration of inner feeling and ex- 
ternal experience. The more deeply “felt” the 
experience, the more closely is communication 
achieved. 

Much of our art evaluation and criticism 
has centered on mastery, to the exclusion of 
communication. Too often, though, attention 
to first steps in mastery has been neglected in 
the early phases of painting, and over-stressed 
in the later phases. The authors believe there 
should be a balance in emphasis on mastery 
and communication at all levels of painting. 


Some teaching is valuable in all phases. 
The teacher has a role to play that becomes 
more specific and clear as paintings are 
watched and studied. Awareness of the value 
of pre-representational painting will stimulate 
more understanding of the child’s develop- 
ment and of how to assist in it. In later 
articles the writers will expand on: 


1. Specific ways to help a child grow within 
that phase level which he sustains with satis- 
faction. 

2. How to look at children’s paintings and 
judge them: for what they are doing to the 
child, in what way and how the child is com- 
municating his experience, and what the 
teacher can do to bring about the next step 
in the child’s progress. 

3. Desirable painting set-ups in the day 
nursery or classroom that produce more satis- 
fying painting; and what to do with groups 
of five, ten, twenty, or more children. 





THE EFFECT OF AMPLIFYING MATERIAL 
UPON COMPREHENSION 


MARY CAROLINE WILSON 
Louisiana Polytechnic Institute 


Many difficult concepts are presented in 
general statements in textbooks, which com- 
prise a major portion of the reading material 
provided for pupils in the elementary grades 
of the country. There is no indication at 
present that the widespread use of textbooks 
will be decreased in the near future. Numer- 
ous investigations have shown that there is 
no grade level where pupils of average ability 
are able to read with adequate understanding 
the textbooks provided for them. One reason 
why textbooks are inadequately understood 
is that they contain many general and ab- 
stract statements which do not provide suffi- 
cient detail for pupils with limited experien- 
tial and linguistic backgrounds. As a result, 
such pupils are prone to memorize the lan- 
guage about facts instead of gaining actual 
knowledge and understanding. 

The primary purpose of this investigation 
was to determine the effect of amplification 
of general statements upon the reading com- 
prehension of children in the intermediate 
grades. 


MATERIALS FOR THE STUDY 


The materials for this experiment were 
written by the investigator for children in the 
upper elementary grades of North Louisiana. 
Three articles, each three hundred words in 
length, using general statements similar to 
those found in textbooks, were written on 
some aspect of paper: one on paper in the 
present wartime setting, one on the history 
of paper, and one on the paper industry in 
the United States. Each three-hundred-word 
article was then expanded into two lengths: 
an article six hundred words in length and 
one twelve hundred words in length. State- 
ments from each of the three-hundred-word 
articles were retained verbatim in the two 
expanded versions, while statements from the 
six-hundred-word articles were also retained 
verbatim in the twelve-hundred-word ver- 
sions. Details for the amplified versions were 
determined on the basis of this question: 
What can be added to enable a typical 
Louisiana child to use his experiential and 
linguistic background in attaining adequate 
comprehension of the concepts presented? 

According to the Gray-Leary formula for 
predicting difficulty, the reading materials 
were all well below the reading level for 
children participating in the experiment. 

An example of the amplification provided 
in the six-hundred- and twelve-hundred-word 
versions for a general statement concerning 
clay tablets may be seen in the following 
excerpt from “The History of Paper.” 

Short Version (Three-hundred-word arti- 
cle). Clay tablets, papyrus, and skins were 
used before paper was discovered. 

Doubled Version (Six-hundred-word arti- 
cle). Clay tablets, papyrus, and skins were 
used before paper was discovered. Men shaped 
wet clay into thick squares and marked upon 
them with sharp sticks. These tablets were 
baked much as bricks were baked, but records 
on clay were not convenient. 

Long Version (Twelve-hundred-word arti- 
cle). Clay tablets, papyrus, and skins were 
used before paper was discovered. Men shaped 
wet clay into thick squares and marked upon 
them with sharp sticks. While clay tablets 
were made in different forms, many of them 
were shaped something like large shredded- 
wheat biscuits. These tablets were baked 
much as bricks were baked, but records on 
clay were not convenient. Clay was not a 
good material for record keeping because it 
was too easily broken. 
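
The nesting rule described above (every statement of a shorter version kept verbatim in each longer version) can be checked mechanically. The following is a minimal sketch in Python using only the clay-tablet excerpt quoted above; the helper names `sentences` and `retained_verbatim` are hypothetical, and the naive sentence splitter is an assumption that happens to suffice for this excerpt.

```python
import re

# The three excerpts from "The History of Paper," as quoted in the article.
SHORT = "Clay tablets, papyrus, and skins were used before paper was discovered."

DOUBLED = (
    "Clay tablets, papyrus, and skins were used before paper was discovered. "
    "Men shaped wet clay into thick squares and marked upon them with sharp sticks. "
    "These tablets were baked much as bricks were baked, but records on clay "
    "were not convenient."
)

LONG = (
    "Clay tablets, papyrus, and skins were used before paper was discovered. "
    "Men shaped wet clay into thick squares and marked upon them with sharp sticks. "
    "While clay tablets were made in different forms, many of them were shaped "
    "something like large shredded-wheat biscuits. "
    "These tablets were baked much as bricks were baked, but records on clay "
    "were not convenient. "
    "Clay was not a good material for record keeping because it was too easily broken."
)


def sentences(text: str) -> list[str]:
    """Split after each period that is followed by whitespace (naive splitter)."""
    return [s.strip() for s in re.split(r"(?<=\.)\s+", text) if s.strip()]


def retained_verbatim(shorter: str, longer: str) -> bool:
    """Return True if every sentence of `shorter` appears unchanged in `longer`."""
    longer_sentences = set(sentences(longer))
    return all(s in longer_sentences for s in sentences(shorter))


if __name__ == "__main__":
    print(retained_verbatim(SHORT, DOUBLED))   # short version inside doubled version
    print(retained_verbatim(DOUBLED, LONG))    # doubled version inside long version
```

A check of this kind only confirms that nothing was reworded during amplification; it says nothing about whether the added detail is well chosen.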

In order to appraise the comprehension of 
different groups of children who read the 
different versions of material, three compre- 
hension tests were prepared. Identical tests 
were used to measure the comprehension of 
the different groups of children who read the 
short, doubled, or long versions of an article. 
Five different types of tests were included in 
each comprehension test: (1) four free- 
expression questions requiring written an- 
swers were designed to measure the four major 
concepts of an article; (2) forty multiple- 
choice items with four choices each were 
planned to test the knowledge of facts and 
the ability to do inferential thinking; (3) 
word-meaning tests comprised of those words 
from each of the short articles that did not 
appear in the first four thousand of the 
Thorndike Word List, those words listed for 
the fifth grade or above in the Buckingham- 
Dolch List, and those that appeared in the 
Cole Technical Vocabulary; (4) ten picture- 
selection items with four choices each; (5) 
oral interviews, based on questions designed 
to test the comprehension of pupils concern- 
ing the various concepts embodied in each 
article. 


DESIGN OF THE EXPERIMENT 


Four hundred and five pupils just entering 
the sixth and seventh grades participated in 
this study. All children were residents of one 
parish in North Louisiana, some from rural 
homes and some from town homes. Each pupil 
was tested under three different test condi- 
tions; namely, after having read a three- 
hundred-word article, after having read a six- 
hundred-word article, and after having read 
a twelve-hundred-word article, each of which 
treated a different theme pertaining to paper. 
For administration purposes, classroom 
groups were divided into three sections 
through the technique of random selection. 
The tests were administered by the investi- 
gator with the assistance of classroom teach- 
ers. Ninety children were tested individually 
and interviewed orally. All oral interviews 
were conducted by the investigator immedi- 
ately after a child had read an article and 
had taken the pencil-and-paper test. Re- 
sponses to the oral-interview questions were 
recorded by the interviewer and typed in per- 
manent form on the day the interview was 
held. 


To minimize the effect that might result 
from reading the material in any particular 
order, the testing program was arranged so 
that the pupils in the different schools, as 
well as the ninety pupils who took the tests 
individually, read the materials in different 
order. Pupils in school I took the tests in the 
order of long, doubled, short material; school 
II in the order of short, doubled, long mate- 
rial; school III: doubled, short, long mate- 
rial; schools IV and VII: long, short, doubled 
material; school V: doubled, long, short 
material; and school VI: short, long, doubled 
material. Ten minutes were allowed for read- 
ing the articles regardless of the version. 
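
The reading orders assigned to the seven schools can be compared with the full set of possible orders of three versions. The sketch below, in Python, transcribes the orders listed above and checks whether all six permutations are used; the variable names are hypothetical, and the check covers order of presentation only, not the number of pupils in each school.

```python
from itertools import permutations

# Reading order of the versions in each school, as listed in the text.
ORDER_BY_SCHOOL = {
    "I": ("long", "doubled", "short"),
    "II": ("short", "doubled", "long"),
    "III": ("doubled", "short", "long"),
    "IV": ("long", "short", "doubled"),
    "V": ("doubled", "long", "short"),
    "VI": ("short", "long", "doubled"),
    "VII": ("long", "short", "doubled"),
}

# Every possible order of the three versions (3! = 6 of them).
all_orders = set(permutations(("short", "doubled", "long")))

# Orders actually used; schools IV and VII share one order.
used_orders = set(ORDER_BY_SCHOOL.values())

# True when each possible order was read by at least one school.
covers_all = used_orders == all_orders

if __name__ == "__main__":
    print(len(all_orders), len(used_orders), covers_all)
```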

The following scheme was used in admin- 
istering the testing materials: 


Section I               Section II              Section III 

“Paper Today”           “History of Paper”      “Paper Industry” 
300 word article        300 word article        300 word article 

“Paper Industry”        “Paper Today”           “History of Paper” 
600 word article        600 word article        600 word article 

“History of Paper”      “Paper Industry”        “Paper Today” 
1200 word article       1200 word article       1200 word article 


Through following this procedure, it was 
possible to obtain 135 cases for each of the 
nine different test conditions. For the entire 
investigation, there was a total of 1,215 writ- 
ten test results: 405 with the short three- 
hundred-word articles, 405 with the doubled 
six-hundred-word articles, and 405 with the 
long twelve-hundred-word articles. 
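
The counts reported above follow from the three-section scheme, which pairs each article with each length exactly once. A minimal Python sketch of that arithmetic is given here; it assumes the 405 pupils fell into three sections of equal size, which the figure of 135 cases per condition implies but the text does not state outright. The names `SCHEME` and `is_latin_square` are hypothetical.

```python
# Article and length (in words) read by each section, from the scheme above.
SCHEME = {
    "I": {"Paper Today": 300, "Paper Industry": 600, "History of Paper": 1200},
    "II": {"History of Paper": 300, "Paper Today": 600, "Paper Industry": 1200},
    "III": {"Paper Industry": 300, "History of Paper": 600, "Paper Today": 1200},
}
PUPILS = 405


def is_latin_square(scheme: dict) -> bool:
    """Check that each article meets each length exactly once across sections."""
    articles = {a for plan in scheme.values() for a in plan}
    lengths = {w for plan in scheme.values() for w in plan.values()}
    # Each section reads every article, each at a different length.
    rows_ok = all(
        set(plan) == articles and set(plan.values()) == lengths
        for plan in scheme.values()
    )
    # Each length is paired with every article somewhere in the scheme.
    cols_ok = all(
        {a for plan in scheme.values() for a, w in plan.items() if w == length}
        == articles
        for length in lengths
    )
    return rows_ok and cols_ok


# Assumption: equal sections, so 405 / 3 = 135 pupils per section.
per_section = PUPILS // len(SCHEME)

# Nine test conditions: three articles times three lengths.
conditions = [(a, w) for plan in SCHEME.values() for a, w in plan.items()]

# Written results per length and in total.
results_by_length = {
    length: per_section * sum(1 for _, w in conditions if w == length)
    for length in (300, 600, 1200)
}
total_results = per_section * len(conditions)

if __name__ == "__main__":
    print(is_latin_square(SCHEME), per_section, results_by_length, total_results)
```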

To determine the I.Q. of each pupil partici- 
pating in the experiment, the 
Kuhlmann-Anderson Intelligence Tests were given. The 
Iowa Silent Reading Tests were administered 
to determine the reading ability of each pupil. 
Data from these tests were used to determine 
the effect of amplification upon the compre- 
hension of pupils at different levels of ability. 


TEST RESULTS 


Reliability of the comprehension tests was 
secured by the “odds-even” method and the 
results were stepped up by use of the 
Spearman-Brown Prophecy formula. Result- 
ing coefficients of reliability were .88 for 
“Paper Today,” .86 for “The History of 
Paper,” and .90 for “The Paper Industry.” 
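
For readers unfamiliar with the step-up procedure, the Spearman-Brown formula for a test doubled in length is r_full = 2 r_half / (1 + r_half). The Python sketch below states the general form and its inverse; the half-test correlations are not reported in the article, so the values recovered by `half_test_r` are back-calculations for illustration only, and both function names are hypothetical.

```python
def spearman_brown(r_half: float, factor: float = 2.0) -> float:
    """Predicted reliability of a test lengthened by `factor`.

    With factor=2 this steps a split-half correlation up to full length.
    """
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def half_test_r(r_full: float) -> float:
    """Invert the factor-2 formula: the half-test correlation implied by r_full."""
    return r_full / (2.0 - r_full)


if __name__ == "__main__":
    # Reported stepped-up coefficients for the three comprehension tests.
    for title, r_full in (
        ("Paper Today", 0.88),
        ("The History of Paper", 0.86),
        ("The Paper Industry", 0.90),
    ):
        r_half = half_test_r(r_full)
        # Round trip: stepping the implied half-test value up returns r_full.
        print(title, round(r_half, 3), round(spearman_brown(r_half), 2))
```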

Mean scores for the written tests on the 
short, doubled, and long versions were in the 
case of “Paper Today” 44.62, 45.28, and 
45.96; in the case of “The History of Paper” 
45.02, 43.07, and 48.99; and in the case of 
“The Paper Industry” 43.44, 48.01, and 
49.71 respectively. 

To determine the differences of significance 
in comprehension through reading the differ- 
ent versions of material, critical ratios were 
computed for the comprehension tests. Criti- 
cal ratios indicating a significant difference in 
comprehension were 2.66 favoring the group 
that read the long rather than the group 
which read the short version of “The History 
of Paper,” 3.82 favoring the group that read 
the long rather than the group which read the 
doubled version of “The History of Paper,” 
2.90 favoring the group that read the doubled 
rather than the group which read the short 
version of “The Paper Industry,” and 4.01 
favoring the group that read the long rather 
than the group which read the short version 
of “The Paper Industry.” In the case of 
“Paper Today,” the differences in comprehen- 
sion resulting from reading the various ver- 
sions were so slight as to be attributable to 
chance variations in the random selection of 
the groups. 
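
A critical ratio of the kind reported here is ordinarily the difference between two means divided by the standard error of that difference. The Python sketch below assumes the usual formula for two independent groups; the article does not state which formula was used, nor does it report standard deviations, so the standard deviation of 12 in the usage example is purely hypothetical and the printed value is not expected to reproduce the published ratio.

```python
import math


def critical_ratio(mean_a: float, sd_a: float, n_a: int,
                   mean_b: float, sd_b: float, n_b: int) -> float:
    """Difference between means divided by its standard error.

    Assumes two independent groups; this is an assumed formula,
    not one confirmed by the article.
    """
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


if __name__ == "__main__":
    # Means and group sizes are from the article ("The History of Paper,"
    # long vs. short version); the standard deviations are hypothetical.
    cr = critical_ratio(48.99, 12.0, 135, 45.02, 12.0, 135)
    print(round(cr, 2))
```

Because the ratio depends on the spread of scores as well as on the difference in means, equal differences between means can yield quite different ratios from one article to another.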

For pupils in the group with the highest 
fourth of intelligence, critical ratios indicating 
statistically significant differences in compre- 
hension were 3.57 favoring the group that 
read the long rather than the group which 
read the short version of “Paper Today,” 
3.85 favoring the group that read the long 
rather than the group which read the short 
version of “The History of Paper,” and 2.68 
favoring the group that read the long rather 
than the group which read the doubled ver- 
sion of “The History of Paper.” For pupils in 
the group with the lowest fourth of intelli- 
gence, critical ratios indicating statistically 
significant differences in comprehension were 
2.69 favoring the group that read the long 
rather than the group which read the doubled 
version of “The History of Paper,” and 2.97 
favoring the group that read the long rather 
than the short version of “The Paper Indus- 
try.” In no case was a significant advantage 
to comprehension shown for the groups that 
read the short version of articles. 

Averages for the percentages of correct 
items were computed for each of the four 
different types of written tests. With one ex- 
ception, the long version of the word-meaning 
test for “Paper Today,” averages for the per- 
centages of correct items were higher for the 
groups that read long versions than for groups 
which read the short versions of articles. 







Conceptual difficulties, even more than 
structural difficulties, caused the reading 
materials to be difficult. Through application 
of the Gray-Leary formula for predicting 
structural difficulties, the long version of 
“The History of Paper” was found struc- 
turally to be almost two reader-grade levels 
more difficult than the short version. Yet in 
every analysis made of the test results, com- 
prehension was definitely facilitated through 
reading the long, but structurally more diffi- 
cult, version. It seems reasonable to attribute 
the differences found in comprehension to the 
meager statements in the short article and to 
the amplification provided in the long article. 


Concepts found most difficult for groups 
that read the short versions of articles were 
found to remain relatively the most difficult 
for other groups which read the long versions 
of articles. Most of these concepts were made 
less difficult by amplification. In general, test 
items that measured comprehension of con- 
cepts farthest removed from the experiences 
of the children showed the highest percentage 
of accuracy for the group which read the de- 
tailed version. In the case of certain concepts, 
however, amplification was not shown to 
facilitate comprehension. 


A careful examination of oral-interview 
responses indicated more advantageous effects 
from amplification than were displayed in re- 
sponses on the pencil-and-paper tests. Greater 
interest and more logical reasoning were 
shown by pupils’ responses to oral-interview 
questions in the case of concepts that the 
children could relate to their experiential 
backgrounds. Three principal advantages of 
the oral interview were: first, the interviewer 
had an opportunity to make certain that the 
pupils understood the questions; second, the 
interviewer could question pupils beyond the 
point of a verbalistic answer and could gain 
some insight into how they had related and 
organized the materials read; third, pupils 
who made initial inadequate or erroneous re- 
sponses could be further interrogated to deter- 
mine whether actual misconceptions existed. 


CONCLUSIONS 


Data from this investigation indicate that, 
in general, amplification of reading materials 
is advantageous to comprehension. These 
data, however, do not warrant making ex- 
travagant claims for amplification. The effect 
of amplified reading materials upon compre- 
hension is largely dependent upon the ability 
of writers to anticipate the difficulty of con- 
cepts and to supply essential details in a skill- 
ful manner. Since it is difficult to predict with 
certainty the nature of the difficulties that 
children will encounter in reading a given 
selection, it seems desirable that before pub- 
lication authors should test their materials 
with a number and variety of children, and 
then revise the subject matter in the light of 
the difficulties revealed. 





GROWTH AND ACHIEVEMENT IN BASIC ENGLISH SKILLS 


FLORENCE M. LUMSDEN 


Woodrow Wilson High School 
Washington, D. C. 


THE PROBLEM 


This study is an investigation of the 
achievement and growth in certain English 
skills of students during the first half of their 
attendance in a Washington, D. C., senior 
high school, situated in a socially and eco- 
nomically favorable environment. Limitation 
of the investigation to certain areas of Eng- 
lish skills made it possible to obtain specific 
data in these areas. However, the findings 
should be viewed by curriculum workers in 
relation to the entire field of instruction in 
English. Deficiencies in these skills may be 
compensated by achievement in other areas. 
The aim of this study was to provide objec- 
tive data for possible use, if it is deemed that 
a high degree of proficiency in such skills 
should be an outcome of secondary education 
in this school or a comparable one. 

The method used was the test and re-test 
technique, and the examination of errors by 
item analysis. All classes in the semesters of 
English that were tested were involved in the 
study, not just the classes of one or two 
teachers. It is important to remember this, 
because the use of all classes provided a 
measure of educational progress under typical 
conditions, with teachers following the same 
course of study, but varying their methods 
according to the patterns they have developed 
from their own experience. In other words, 
the investigator was concerned with what 
happens in a large urban high school in a good 
community, with a faculty representative of 
such institutions as this high school, under 
the usual procedures in the classroom. 

The term “basic language skills in Eng- 
lish” in this study refers to the ability to 
give the correct answers on test items in the 
areas of punctuation, capitalization, usage, 
and spelling. 

Primarily, then, this study was concerned 
with the “what” of learning in these areas of 
English skills, not with the “how” or methods 
of instruction. Not all areas of learning lend 
themselves so readily to measurements of 
progress; it is for this reason, rather than 
that skills are considered the only or most 
important of educational outcomes, that this 
detailed examination of conditions was made. 


To secure data for the study three testings 
were arranged, one in the first semester of the 
tenth year, one at the end of the tenth year, 
and one at the end of the first half of the 
eleventh year. Henceforth in the report these 
testings will be called 10A, 10B, and 11A re- 
spectively. Because of the administrative de- 
tails involved in so large a program, it was 
not possible to give the first test until the 
pupils were well in the 10A semester, namely, 
in January, 1942. The second testing took 
place in June, 1942, at the end of the 10B 
semester. The third testing was in January, 
1943, at the end of the 11A grade. The first 
testing constituted a survey of school achieve- 
ment in basic English skills for pupils who 
had entered the Woodrow Wilson High School 
the previous fall, that is, in September, 1941; 
they had spent three and one-half months in 
senior high school. The second testing was a 
progress study, made to sustain the interest 
of the school in the study and to see whether 
progress was being made. A slight loss was 
shown in punctuation, but there was a gain 
in all other areas. No item analyses were 
made of this testing, and, therefore, it is not 
referred to in the later sections of this report, 
which are based on statistical treatments of 
the first and third testings. The third testing 
was a survey of achievement in these skills at 
the end of the first semester of the 11th grade, 
which was primarily a composition semester; 
the last half of the eleventh year and the 
twelfth year focus attention on literature 
rather than on language arts. 


The test used in this study was the Iowa 
Basic English Skills Test, Advanced, Form 
M. The test forms were distributed at the 
beginning of the testing period and collected 
at the end of the period; English teachers 
who proctored the tests saw the forms only 
during the test; copies were not available for 
any classroom drill on specific items. All three 
testings were administered through the school 
office and involved all pupils in the semester 
of English being tested. Although the norms 
for this test do not go above the first semester 
of the tenth year of school, they are so de- 
tailed for grades below the tenth that central 
tendencies of growth are established by them. 
In addition, the glaring deficiencies that 
worry curriculum workers, teachers, and ad- 
ministrators are in the skills that should have 
been mastered before high school; for reme- 
dial purposes grade-school norms are sufficient 
for high school. 


The usual techniques of establishing valid- 
ity for the test were used. In a meeting held 
by the head of the department of English in 
the District of Columbia public schools, it 
was agreed that the skills covered in the test 
were fundamental. The types of items in the 
test were checked against the material in the 
textbooks which the teachers reported they 
had used in the classes tested. On the basis 
of textbook content and reported use of these 
texts, the tests were valid. 


The course of study in use during the 
period of this study called for twelve weeks 
of work in the language arts in the 10A grade 
in both intensive and extensive groups.¹ In 
the 10B the intensive classes were to devote 
five weeks to composition and grammar; in 
the 11A these classes were to devote thirteen 
weeks to these areas of English. In the 10B 
the extensive groups were to devote nine 
weeks to composition and grammar; and, in 
the 11A, thirteen weeks. Both written com- 
position and oral composition were to be 
practiced, but no rules were laid down about 
the amount of time to be devoted to each 
form of expression. During the period of this 
study there were thirteen sections of intensive 
English and two of extensive English in the 
school. 


The course of study is not a detailed sylla- 
bus in the city of Washington. It is a flexible 
instrument, deliberately made so in order that 
schools may adapt its offerings to community 
needs. The time allotments are suggested, but 
even these may be modified, if the teachers 
and principals see the need for change. Meth- 
ods are suggested, but not dictated to 
teachers. 

1 These groups are defined by the school officials as follows: 
“Intensive English indicates work that is meant as a challenge 
to the ability of excellent and superior pupils who are good 
readers. These are usually the academic, the high business 
and high average groups to whom are given college certification. 

“Extensive English indicates a general, practical course for 
pupils who are not literary-minded. These are, for the most 
part, pupils of low average or below average ability in English. 
For them high school English is usually a terminal course.” 

All the teachers of the pupils represented 
in this study were people of experience in the 
local school system, with terms of teaching in 
the Washington schools from eight to thirty 
years. All the teachers at some time had 
served on committees to study problems in 
the field of English in the city system. 

These data about the school, its curriculum, 
and its faculty are given to explain that the 
conditions of the study were those of the 
average large urban high school. There was 
no attempt to control conditions or to suggest 
methods of teaching to teachers; on the con- 
trary, every effort was made to avoid any 
comparison of class with class, or teacher with 
teacher. The entire purpose of the investiga- 
tion was to make a vertical study, as objec- 
tively as testing will permit, of achievement 
and growth in these skills through the semes- 
ters in which the greatest emphasis is placed 
upon these skills. 


FINDINGS 


Before attempting to summarize the 
achievement of pupils after they had been in 
the school three semesters, it was necessary 
to separate the tests of those who had gone 
through three semesters in this school from 
those of the entire group tested. Considerable 
change had taken place in the school popula- 
tion during the three semesters, for of the 469 
pupils who were tested the first time, only 260 
were in school for the final testing. 


The papers of these 260 for both 1942 and 
1943 were segregated for detailed study. They 
were first arranged into eight groups on a 
decile I.Q. scale from 150 to 70. The median 
I.Q. was 115.1; the mean, 114.7; and the 
standard deviation, 10.6. 

Two types of approaches were made in the 
statistical treatment: (1) a study of achieve- 
ment through summaries of correct responses, 
and (2) a study of error through item anal- 
yses and tabulation of error. Without going 
into the details of the statistical data through 
the presentation of tables, the writer will 
sketch for the reader the general picture in 
achievement and the detailed picture of diffi- 
culties shown by the studies of error. 

In relation to standard performance as
measured by national norms, the Woodrow
Wilson pupils in the 10A testing were below
expected achievement. During the period of
the study, growth was made in all areas. A
tendency toward a negative correlation between
intelligence and achievement in these basic
skills was shown by the coefficients of
correlation between intelligence and achievement
in terms of total score in the 10A and in the 11A;
at the beginning of the study that correlation
was -.69; at the end, it was -.83. Correlations
between rank in gain and decile I.Q.
rank likewise showed negative tendencies in
each of the four areas, as follows: punctuation,
-.80; capitalization, -.59; usage,
-.47; spelling, -.90. It is recognized that
gains become more difficult nearer the top of 
the scale and, therefore, rank in gain might 
be influenced by the position on the scale at 
which the gain is made. The areas in which 
the greatest negative correlations were shown, 
punctuation and spelling, were those in which 
the initial achievement was lowest and, there- 
fore, areas in which gains were being made 
farther down the scale than in the other two 
tests. 

Performance was found to be not predict- 
able on the basis of intelligence. However, 
the gains of certain groups in certain areas, 
e.g., achievement of perfection by the highest 
groups in the area of punctuation, indicate 
that much more can be expected of pupils 
than is currently demanded. 

At the end of three semesters in senior high
school, after which more attention is given to
the study of literature than to these basic
skills, these pupils showed a general proficiency
of 90 per cent. On the separate subtests
they had median scores as follows: punctuation,
89.1 per cent; capitalization, 90.5
per cent; usage, 91.9 per cent; and spelling,
87.8 per cent. Here one might pause to ask
whether this is enough, but let us proceed to
additional facts.

Again in the form of a textual summary, 
rather than by charts and tables, the findings 
of the second part of this study are presented, 
concerned with the diagnoses of difficulty in 
the separate areas of the test. Item analyses 
of the four subtests comprising the whole test 
were made of the items missed on the first 
and last testings. The records were made 
according to the learner’s place in one of the 
eight decile groups of I.Q.’s. Three tabulations
of these items were then made: (1) items
missed only on the first testing, (2) items
missed only on the last testing, and (3) items
missed both times. This arrangement was
adopted in order to discover: (1) which items
are most frequently unlearned when pupils
enter senior high school, (2) which items are
most frequently forgotten, and (3) which
items are most persistently missed or most
difficult to learn. In addition to this information,
which could have been obtained by a
random handling of tests, tabulation by I.Q.
groups furnished information about which
learners missed these various items.
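
The three tabulations can be pictured with a small modern sketch. It is not the writer's procedure, which was carried out by hand on the test papers; it merely assumes that each pupil's missed items are available as sets of item numbers. The function name `tabulate_misses` and the sample pupils are hypothetical.

```python
def tabulate_misses(first_missed, last_missed):
    """Split each pupil's missed items into the three tabulations.

    Both arguments map a pupil label to the set of item numbers
    that pupil missed on the first and on the last testing.
    """
    only_first, only_last, both = {}, {}, {}
    for pupil in first_missed.keys() | last_missed.keys():
        first = first_missed.get(pupil, set())
        last = last_missed.get(pupil, set())
        only_first[pupil] = first - last  # unlearned at entry, mastered later
        only_last[pupil] = last - first   # "forgotten" items
        both[pupil] = first & last        # "persistent errors"
    return only_first, only_last, both


# Hypothetical records for two pupils (illustrative data only).
first = {"A": {1, 2, 5}, "B": {2}}
last = {"A": {2, 7}, "B": set()}
only_first, only_last, both = tabulate_misses(first, last)
```

Keeping the pupil label as the key is what allows the later breakdown by I.Q. group; a tally of items alone would lose that information.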


After these tabulations had been made, 
ranking lists of items were made correspond- 
ing to the time when the errors were made; 
the most frequently missed items were placed 
at the top of the lists, and the others arranged 
below in the order of frequency of error. 
These ranking lists were then studied to dis- 
cover whether the same items were found at 
or near the top of all three lists. 

Then special attention was given to the 
ranking list of “forgotten” items (those 
missed on the last testing, but not on the first)
and to the “persistent errors” (those items 
missed at the end as well as at the beginning 
by the same pupils). In all cases of this close 
scrutiny, attention was confined to those 
items missed by 26 (that is, by 10 per cent) 
or more of the learners; actually, of course, 
the items involved were missed in the first 
or last testing by more pupils than the sepa- 
rate lists show, for the learners who missed 
the items both times were tabulated sepa- 
rately. But by limiting attention in each case 
to those items unlearned at the beginning, 
forgotten, and persistently missed by 10 per 
cent or more of the pupils tested, one could 
get a fairly good idea of the difficulty of these 
items and a chance to compare the items on 
the separate lists without becoming lost in a 
host of figures. 
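
As a rough illustration of the ranking lists and the 10 per cent cutoff (26 of the 260 pupils), the sketch below counts how many pupils missed each item, drops items below the cutoff, and orders the rest by frequency of error. The function name `ranking_list` and the sample data are assumptions made for the example, not part of the study.

```python
from collections import Counter


def ranking_list(missed_by_pupil, n_pupils, min_percent=10):
    """Return (item, pupils_missing) pairs, most frequently missed first.

    Only items missed by at least `min_percent` per cent of the
    `n_pupils` tested are kept. Integer arithmetic avoids rounding
    trouble exactly at the cutoff.
    """
    counts = Counter(
        item for items in missed_by_pupil.values() for item in items
    )
    kept = [
        (item, n) for item, n in counts.items()
        if n * 100 >= min_percent * n_pupils
    ]
    # Sort by descending frequency, then by item number for stable ties.
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical list: item 1 missed by 26 pupils, item 2 by 25, out of 260.
sample = {f"p{i}": {1} for i in range(26)}
for i in range(25):
    sample[f"p{i}"] = sample[f"p{i}"] | {2}
ranked = ranking_list(sample, n_pupils=260)
```

With these figures only item 1 survives, since 25 pupils fall just short of 10 per cent of 260.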

Because the achievement results already 
presented showed such a lack of relationship 
between intelligence and accomplishment, the 
question of relationship of ability to difficulty 
of subject matter naturally arose. Did the 
slower pupils show greater gains in achieve- 
ment by mastering the easier items and con- 
sistently missing the greater percentage of 
difficult items? Did they do more “forgetting” 
than the brighter pupils? These were ques- 
tions to be investigated through item analyses 
of errors. 

After the persistent errors and forgotten
items were identified, therefore, the percentages
of error on these items by the eight decile
I.Q. groups were computed. The group showing
the lowest percentage of error was ranked
1, and the group showing the highest percentage
of error was ranked 8. Coefficients of
correlation were then computed by the rank-difference
method to discover any tendencies
toward relationships between high intelligence
and low percentage of error on items easily
forgotten or persistently missed. The findings
in each of the areas are presented in the sections
that follow.
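
The text names only the "rank-difference method"; the sketch below assumes this is the usual Spearman formula, rho = 1 - 6(sum of d squared) / (n(n squared - 1)), which holds when no ranks are tied. The function name and the sample rankings of the eight I.Q. groups are hypothetical.

```python
def rank_difference_rho(ranks_x, ranks_y):
    """Spearman rank-difference coefficient for two untied rankings."""
    n = len(ranks_x)
    if n != len(ranks_y) or n < 2:
        raise ValueError("need two rankings of equal length, at least 2")
    # d is the difference between the two ranks given to the same group.
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))


# Eight decile I.Q. groups ranked 1 (highest I.Q.) to 8 (lowest).
iq_rank = [1, 2, 3, 4, 5, 6, 7, 8]

# Hypothetical error ranks: 1 = lowest percentage of error.
same_order = [1, 2, 3, 4, 5, 6, 7, 8]
reversed_order = [8, 7, 6, 5, 4, 3, 2, 1]
one_swap = [2, 1, 3, 4, 5, 6, 7, 8]

rho_same = rank_difference_rho(iq_rank, same_order)
rho_reversed = rank_difference_rho(iq_rank, reversed_order)
rho_swap = rank_difference_rho(iq_rank, one_swap)
```

With only eight groups, a single exchange of adjacent ranks already moves the coefficient from 1.0 to about .98, so coefficients computed on so few ranks are best read as tendencies, as the text itself does.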


Punctuation—The results for punctuation 
are: 


1. Items ranking high in difficulty at the
end also rank high at the beginning.

2. Items ranking low in difficulty at the 
beginning are likewise low in the list of for- 
gotten items and in the list of persistent 
errors. 


3. Definite improvement in the number of
items learned is shown by the lower percentage
of items missed in the last testing.


4. Of the sixteen items persistently missed,
eight (50 per cent) are due to over-punctua- 
tion. 

5. Correct use of quotation marks in direct
discourse and correct use of apostrophes 
account for most of the other persistent errors. 

6. High on the list of persistent errors is 
the failure to use the second comma in setting 
off the name of a state in a sentence. 


7. The failure to recognize a declarative 
sentence when it occurs in combination with 
an interrogative quotation may be considered 
part of the difficulty of punctuating direct 
discourse. 

8. Some correlation between I.Q. rank and 
low percentage of errors on persistently missed 
items prevails in the area of punctuation. 
This is in contrast to the other areas, as will 
be shown later. Punctuation, therefore, might 
be considered as a more difficult area of the 
mechanics of written English, requiring more 
drill with slower pupils; but since forgetting 
occurs with equal ease among all groups, it 
would seem that more time should be placed 
upon punctuation in all classes. 

9. All the items most frequently forgotten
involve over-punctuation. It would seem that
retention of skill in punctuation could be
greatly improved by making conscious and
habitual the application of this rule: Use
punctuation marks only when you know the
reason for such use.

10. Correlations between I.Q. rank and low
percentage of error on forgotten items range
from -.90 to .95. No group, therefore, seems
to have an option on the ability to forget.

11. The strong tendency among the for- 
getters is to over-punctuate; this suggests 
uncertainty, an uncertainty probably growing 
out of insufficient practice. More practice 
surely could improve the over-use of a comma 
between a word and the modifier immediately 
preceding it. The other two commonest errors 
of over-punctuation are the use of a comma 
to make a trivial pause and the use of a 
comma to set off a restricting clause. Such 
situations involve reasoning. In reasoning 
situations, of course, mere drill will not 
suffice. Since the correlations between intelli- 
gence and low percentage of persistent error 
have some significance, it appears that in the 
area of punctuation, a skill involving a great 
deal of reasoning, teachers are doing a good 
job. 

Capitalization—The findings for capitalization
are as follows:


1. Items ranking high in difficulty at the 
end rank high also at the beginning. 

2. Items ranking low in difficulty at the 
beginning are in general also low on the lists 
of forgotten items and persistent errors. Here, 
as in punctuation, it appears that some sub- 
ject matter is more difficult than other parts 
and remains so. 


3. Definite improvement in the number of 
items learned is shown. 


4. Eight of the fourteen persistent errors 
are caused by over-capitalization. Over- 
capitalization seems to be less likely to occur 
among brighter learners than among slower 
ones. 

5. The item highest on the list of persistent 
errors tests the capitalization of the word used 
in place of a person’s name in the salutation 
of a business letter. 

6. The other persistent errors not caused 
by over-capitalization involve the capitaliza- 
tion of names of political bodies, nouns that 
designate definite geographical portions of the 
country, proper adjectives, titles of respect, 
and names of rivers, oceans, and mountains. 


7. Six of the eight forgotten items involve
over-capitalization. “To use capitals only
when you know the reason” is a general rule
that cannot be too strongly emphasized.

8. The item ranking highest among the 
forgotten items is also highest among the per- 
sistent errors. It tests the capitalization of the 
word used in place of a person’s name in the 
salutation of a letter. 

9. The correlations with intelligence indicate
that it is about equally easy for all I.Q.
groups to forget.

10. The same tendency shown among “for- 
getters” of punctuation appears here; that is, 
they show a tendency to over-punctuate or 
over-capitalize. 

Usage.—The conclusions for usage follow: 

1. Items ranking high in difficulty at the 
end rank high also at the beginning. 

2. Items ranking low in difficulty at the 
beginning are in general also low on the lists 
of forgotten items and of persistent errors. 

3. Definite improvement in the number of 
items widely missed is shown. 

4. The items ranking 1 and 2 on the list 
of persistent errors test the skill of having 
subject and predicate agree in number. The 
applications of the rule, however, are not 
“garden-variety” uses. One sentence has a 
subject connected by “neither-nor”; the other 
is introduced by the expletive, “there”, and 
the subject comes after the verb. 

5. Of the seven other persistent errors, four 
involve verb forms, two represent miscella- 
neous word forms, and one involves a redund- 
ancy. Curiously enough, on the item testing 
the participial form, “lying”, there is no sig- 
nificant correlation between high I.Q. rank 
and low percentage of error, but on the item 
testing the past tense, “lay”, there is the 
highest correlation of all (.97) in this group 
of persistent errors. Apparently no group is 
more apt to say “laying” for “lying” than any 
other. Might the explanation be that “lying”
is used more orally than is the past tense, 
“lay”? 

6. In general, the correlations between in- 
telligence and low percentage of error are so 
variable as to be noteworthy. Since usage is 
so largely related to environment, scrutiny of 
these relationships raises the question of 
whether school or society is the educator. 

7. On forgotten items the correlations between
high rank in I.Q. and low rank in percentage
of error are as variable as they are
among the items persistently missed.

8. Six of the eight items on the “forgotten”
list are also on the “persistent error” list of
nine items. Although the pupils did best on
general scores on the usage test, and although
improvement in terms of median achievement
was greatest on this test, the percentage of
forgetting and persistent failure is very close.

9. It is possible that the percentage of forgetting
represents an irreducible minimum,
the inevitable human error. To dismiss the 
matter on that assumption would be easy, 
but it is a conclusion that should be accepted 
only after all methods to improve the situa- 
tion have been tried and found wanting. With 
the increasing emphasis on all language study 
in the schools today, we do well to look to 
achievement in our own language. The func- 
tional approach to grammar need not be aban- 
doned in the effort to achieve mastery. Learn- 
ing the rule that a verb should agree with its 
subject in number need not cripple the per- 
sonality of a normal child. Mastery of the 
principles of relationships between words 
means ease in the acquisition of another 
tongue, since the relationships between words 
know no geography; the methods of express- 
ing such relationships are comparable the 
world over. Mastery of these methods in one 
part of the world makes relatively easy com- 
parison and understanding of other slightly 
different methods in foreign tongues. 

Spelling.—The findings for spelling are:

1. Items ranking high in difficulty at the 
end rank high also at the beginning. 

2. In most cases the words ranking high 
among the forgotten items rank high also 
among the persistent errors. 

3. Items ranking low in difficulty at the
beginning are in general also low on the lists
of forgotten items and of persistent errors.


4. Definite improvement in the number of 
items widely missed is shown. 

5. The coefficients of correlation between
high I.Q. rank and low percentage of error
on persistent errors vary from -.04 to .80,
indicating wide distribution of the tendency
to err among all groups.

6. Four of the six forgotten words are also 
on the list of nine persistent errors. (There 
are fifty words in the test.) 

7. The wide range of correlations between
intelligence and low percentage of error indicates
that the capacity to forget spelling is
fairly well distributed.

8. The indifferent correlations between
high intelligence and low percentage of error
might be compared with the negative correlation
between achievement in spelling and intelligence.
It appears that the pupils who
learn spelling tend to be the slower learners, 
and that the pupils who persist in error or 
who forget are just as likely to be the brighter 
learners as the slower ones. 


Summary.—The first survey of achieve- 
ment, in the 10A grade, revealed that the 
pupils were below the national norm in punc- 
tuation, capitalization, and spelling, and 
approximately at the national norm in usage. 
The 11A survey showed a mean score for the 
group still below the 10A national norm in
spelling. At the beginning, then, the situation 
was neither very good nor very bad. At the 
end of the study the situation had improved, 
but, since the intelligence level of the school 
is considerably above average, the amount of 
growth was somewhat surprising. 


The I.Q. group, 100-91, made the greatest
growth during the period of the study; this
is the best group of the so-called “extensive”
pupils, for whom a course of study with more 
drill on skills has already been provided than 
is given to the intensive pupils. It would seem 
advisable to provide more time for written 
composition in all classes and semesters until 
individual mastery of the basic skills is 
assured. 

Of special interest, because of the psycho- 
logical and pedagogical implications, was the 
discovery of patterns of error by individual 
pupils. Specific weaknesses were identified by 
item analyses of each of the four subtests. As 
would be expected, some items were more 
difficult than others, but it is noteworthy that 
the items which ranked high in percentage of 
error at the beginning ranked high also at the 
end. Closer scrutiny of the tabulation of 
items showed that not only were the same 
items missed initially and finally, but also 
were missed, in a high percentage of cases, by 
the same pupils. Furthermore, the repeating 
offenders were found to be widely scattered 
among all intelligence levels. Forgetting was 
also widely distributed among all I.Q. groups.

THE NUMBER OF DAYS REQUIRED TO ACHIEVE
COMPLETE-LEARNING OF THE WEBSTER SYSTEM 
OF DIACRITICAL MARKS 


RALPH W. HOUSE and EDWARD E. GUNTER


Appalachian State Teachers College 
Boone, North Carolina 


THE PROBLEM 


Pupils must achieve complete-learning of 
some system of diacritics, if they are to make 
an independent analysis of an unfamiliar 
word. The term, complete-learning, implies 
that the pupils can read the complete sym- 
bolization taught without making an error. 
Hence, in this study, the problem was to de- 
termine the number of days of instruction 
that would be required by a pupil in achiev- 
ing a complete-learning of the Webster system 
of diacritics. 


THE EXPERIMENT 


Selection of pupils—All the pupils in one
sixth-grade classroom, and all the pupils in 
one fifth-grade classroom, were included in 
this study; however, two exceptions were 
made. The two exceptions were: (1) pupils 
with an IQ below 75 were excluded; and (2) 
pupils who had been unsuccessful in their 
school work were excluded. Fifty-seven pupils 
were used in the experiment; they were en- 
rolled in the Appalachian Demonstration 
School. 

Measurement of Growth—A twenty-five 
word pronunciation test was administered at 
the beginning and during the latter part of 
the study. The examiner tested one pupil at 
a time. The test used is presented below. 


WORD-ANALYSIS TEST, FORM B


 1. paf           2. [illegible]     3. ging
 4. [illegible]   5. chel            6. doul
 7. K6f’ dm       8. mfr’ &n         9. poo kal’
10. mit zén’     11. da vel’        12. br’ dz
13. Bn’          14. wi’ né         15. hu noth’
16. kw¥r! dk     17. bar’ ra bool   18. pe ta vo
19. 58 dd de     20. rez’ 2         21. stv! & Wn
22. gool’a n¥    23. ka rj me       24. fa fej S1
25. db zhd nf’

[symbol illegible] as in about, sofa

Pupils were required to make a detailed,
oral analysis of each word in the test. Such
a procedure, it was thought, would enable the
examiner to determine the proficiency with
which each pupil could read the symbols and
employ the techniques taught.

The examiner tested a number of the most 
successful pupils at the end of the seventh 
week, but none had achieved complete- 
learning. Each Friday, thereafter, pupils were 
selected by the critic teachers for the exam- 
iner to test. If a pupil did not succeed the 
first time, he was tested the following Friday, 
etc., until he did pass the test or gave evidence 
that he could never pass the test. 

Methods of instruction—The organic approach
and the acoustic approach were used
in teaching each English speech sound. Both
approaches were explained in a previous
study (1).

A planned order of attack was followed 
during each lesson period. (1) The teacher 
explained the most characteristic phase of the 
position and movement of the speech organs 
in the production of the English speech sound 
presented for mastery by the pupils during a 
recitation period. The teacher uttered the 
sound both in isolation and by blending the 
sound with other sounds with which the 
pupils were familiar. This gave the pupils an 
auditory experience with known and unknown 
words. (2) Rapid drill on the nonsense words 
was conducted by the critic teacher. The 
teacher kept each pupil alert for his turn to 
analyze a nonsense word by not letting him 
sense when his turn would come. Drill on the 
nonsense words gave pupils a combined visual 
and auditory experience in reading symbols 
and blending the sounds that the symbols 
represented. (3) In each recitation period a 
review was made of the most difficult words 
studied in the three preceding lessons. (4) 
Each recitation period was closed by the 
critic teacher dictating new or unfamiliar 
nonsense words for the pupils to write. Pupils
were required to write the new, nonsense
words in the Webster complete symbolization.
The dictation of twenty new, nonsense words
to be written in the Webster system of diacritics
gave the pupils a combined auditory,
visual, and kinesthetic experience in reading 
symbols and blending sounds. 


As the study progressed, some plan of moti- 
vation seemed necessary to keep the less suc- 
cessful pupils’ interest at a high level. The 
critic teachers proposed a series of contests 
similar in plan to a basketball tournament. 
Beginning with the eleventh week of the 
study, the pupils in the two grades were 
divided into seven teams. The number of 
fifth-grade pupils and the number of sixth- 
grade pupils on each team was kept as evenly 
balanced as the total number of pupils in 
each grade would permit. The critic teachers 
appointed a captain for each team. 


The first seven days in the last two weeks 
were used by each team as a practice period 
for final grooming of its teammates. The cap- 
tains directed the review work that their 
teammates did in final preparation for the 
pronunciation tournament. The critic teach- 
ers supervised the work of each captain and 
his or her team. 


The pronunciation tournament was run off
in the same manner as a basketball tourna- 
ment. The run-off covered a three-day period, 
or the last three days in the twelfth week 
during which this study was in progress; the 
time allotted each day for the tournament was 
forty minutes. The time allotted to the rapid 
drill on the diacritical marks was twenty-five 
minutes daily for fifty-seven days. 


Instructional materials—A detailed ex- 
planation of the instructional materials used 
in this study can be found elsewhere (1). In 
addition to the instructional materials de- 
scribed at length in the same source, twenty 
one-syllable nonsense words were dictated to 
the pupils near the close of each recitation
period. The drill on the polysyllabic words 
was supplemented by the use of 130 words 
taken from the Gazetteer of Webster’s New
Collegiate Dictionary, Fifth Edition. 

In this study the drill work or systematic 
instruction on the diacritical marks was sup- 
plemented by a functional use of what had 
been learned during the drill period. Unfa- 
miliar words encountered in health, science, 
geography, and history were analyzed by the 
pupils as a group and under the supervision 
of their critic teachers.

Length of experiment.—This experiment
began on January 31, 1944, and closed on
the following April 21. Tables I and II show
the length of time in days required by each 
pupil to achieve a complete-learning of the 
Webster system of diacritical marks. As 
pupils took their test and gave evidence of 
having achieved complete-learning, they con- 
tinued their reading of the symbols, in order 
that they might achieve over-learning in this 
work. Over-learning should be the teacher’s 
goal. No effort, however, was made in this 
study to evaluate the degree of over-learning 
achieved by the more successful pupils. 


THE DATA


In Table I pupil 3 has the highest mental
age and also the highest grade-placement
score in general achievement. She achieved
complete-learning in 32 days, which is the
best record made by any pupil in either the
fifth grade or the sixth grade. Pupils 1 and 7
were the only two learners in the sixth grade
with normal mental ability who achieved
complete-learning on the 40th day of the
study. Pupils 1, 6, 10, 11, and 34 achieved 
complete-learning in less than 40 days. Pupil 
34 was absent 21 days previous to the last 
two weeks of the study, making it impossible 
for him to take the test before the 55th day; 
he achieved complete-learning in 39 days. 

Pupil 28 has an IQ of 114; he was absent 
two days and achieved complete-learning on 
the 55th day. Pupil 28, in the opinion of his 
critic teacher, had the habit of calling an un- 
familiar word by any name that occurred to 
him to be appropriate. Pupil 28 did not seem 
to see the need for him to master the diacrit- 
ics taught in this study. 

Pupils 22, 26, 31, and 34 are rated as dull 
normal, having IQ’s of 75, 76, 77, and 75
respectively. They achieved complete-learning 
in 48, 43, 54, and 39 days respectively. 
Pupils 26 and 34 rate as dull normal; they 
did work comparable to that of pupils 2, 5, 
8, and 10 with IQ’s of 129, 121, 129, and 115 
respectively. 

In Table II, pupils 36 and 38 achieved 
complete-learning in 37 and 38 days respec- 
tively, making the best records for the fifth- 
grade pupils. Pupil 56 is the only pupil in the 
fifth-grade rated as dull normal mentally; he 
achieved complete-learning in 59 days. Pupils 
50, 51, 52, 54, 55 and 57 are normal pupils 
mentally, as evidenced by IQ’s of 111, 110,

TABLE I

A COMPARISON OF THE PRONUNCIATION GROWTH, MENTAL GROWTH, EDUCATIONAL
GROWTH, AND NUTRITIONAL GROWTH OF 35 SIXTH-GRADE PUPILS

Column headings: Pupil; Pronunciation Growth (Initial Test Score 1, Complete Learning Achieved 2, Days Present 3); Mental Growth 4 (C.A., M.A., IQ); Educational Growth 5 (Reading, English, Spelling, General Achievement); Nutritional Growth 6.

[Legible entries, in the order printed; alignment with pupil numbers is not preserved.]

Years-and-months entries: 12-0, 11-3, 11-11, 11-8, 11-3, 11-0, 12-1; 12-7, 14-6, 15-6, 13-6, 13-7; 12-3, 12-7, 11-2, 12-1, 13-1, 11-3, 13-11, 13-2, 11-10, 11-8, 12-7, 12-10, 12-9, 13-7, 11-2, 39 13-5, 52 11-4.

Single-digit entries: 7, 4, 4, 8, 7, 5, 7, 6, 9, 4, 3, 2, 3, 4, 3, 2, 8, 1, 6, 0, 6, 1, [illegible], 4, 4, 2, 3, 3, 3, 2, 2, 3, 3, 5, 1.

IQ: 105, 129, 130, 116, 121, 122, 104, 129, 105, 115, 108, 94, 95, 127, 103, 102, 117, 86, 87, 110, 110, 75, 82.

1 Initial test score is given in words.
2 Complete-learning (100 per cent mastery) is given in days, e.g., 40th day of the experiment.
3 “Days present” column gives the number of days each pupil was present during the experiment.
4 Mental growth is given in years and months; Otis Self-administering Test of Mental Ability, Form A was given.
5 Educational growth is given in terms of grade placement; the Metropolitan Achievement Test, Intermediate Battery Complete, Form A (Revised) was given.
6 Nutritional growth is given in grams of hemoglobin; 12.5 to 13.5 was taken as the norm.


109, 107, 108, and 114, but their lack of suc- 
cess, in the opinion of their critic teacher, was 
due to an over-sensitivity relative to achiev- 
ing success quickly in this study. Pupil 57 was 
nearly always absent when the examiner came 
to administer the test, which probably 
accounts for the fact that he achieved 
complete-learning in 56 days. 

An examination of the nutritional growth
of each pupil, as evidenced by hemoglobin
readings, reveals some facts not
directly related to the purpose of this investigation,
which may be of interest to teachers.
Table I shows that thirteen or 41 per cent of
the pupils who were given a hemoglobin test
in Grade 6 had a hemoglobin reading of 9.5
grams or less. Table II reveals that eight or
50 per cent of the pupils given a hemoglobin
test in Grade 5 had 9.5 grams or less of
hemoglobin.

Kolmer (2:26) states that the norm for 
the children used in this study is 12.5 to 13.5

TABLE II

A COMPARISON OF THE PRONUNCIATION GROWTH, MENTAL GROWTH, EDUCATIONAL
GROWTH, AND NUTRITIONAL GROWTH OF 22 FIFTH-GRADE PUPILS

Column headings: Pupil; Pronunciation Growth (Initial Test Score 1, Complete Learning Achieved 2, Days Present 3); Mental Growth 4 (C.A., M.A., IQ); Educational Growth 5 (Reading, English, Spelling, General Achievement); Nutritional Growth 6.

[Legible entries, in the order printed; alignment with pupil numbers is not preserved.]

Years-and-months entries, first group: 10-3, 11-0, 10-10, 10-5, 11-1, 10-5, 10-9, 11-6, 10-2, 10-10, 12-3, 12-0, 10-0, 11-11, 10-1, 10-5, 10-6, 10-7, 10-8, 10-5, 11-11, 10-1.

Years-and-months entries, second group: 13-0, 12-0, 13-5, 12-9, 13-8, 12-4, 10-3, 9-9, 13-0, 11-8, 10-0, 10-2, 11-2, 11-6, 11-5, 9-3, 11-5, 11-3, 9-4, 11-6, 10-10, 9-10.

IQ: 127, 109, 114.

1 Initial test score is given in words.
2 Complete-learning (100 per cent mastery) is given in days, e.g., 40th day of the experiment.
3 “Days present” column gives the number of days each pupil was present during the experiment.
4 Mental growth is given in years and months; Otis Self-administering Test of Mental Ability, Form A was given.
5 Educational growth is given in terms of grade placement; the Metropolitan Achievement Test, Intermediate Battery Complete, Form A (Revised) was given.
6 Nutritional growth is given in grams of hemoglobin; 12.5 to 13.5 was taken as the norm.


grams of hemoglobin. Youmans (3:27) says 
that a low hemoglobin reading is an indicator 
of an iron deficiency anemia. Youmans also 
states that even a small iron deficiency 
anemia may greatly affect a pupil’s health. 


CONCLUSIONS AND RECOMMENDATIONS 


An evaluation of the data obtained in this 
experiment seems to warrant the following 
conclusions and recommendations: 


1. Success in reading, English, spelling, 
and general achievement is a reliable predic- 
tive criterion of rapid progress in mastering 
diacritics. Pupils who are low in all four areas 
are likely to achieve success in mastering 
diacritics very slowly. However, no one area 
seems to stand out as the best single predic- 
tive criterion of success in learning to use the 
diacritical marks. 

2. Elementary-school pupils who are
achieving success in their school work can
arrive at a complete-learning of the Webster
system of diacritics in a reasonable length of
time.

3. The sixth-grade pupils gave a greater 
manifestation of interest and effort in master- 
ing the system of diacritics taught than did 
the fifth-grade pupils. 

4. It is the opinion of the critic teachers 
who participated in this study that the sixth 
grade may be the most satisfactory level at 
which to attempt a complete-learning of a 
system of diacritics as a phonetic aid in mak- 
ing an independent analysis of an unfamiliar 
word. 


5. The sixteenth-century symbolization
used in our elementary-school textbooks and
daily newspapers wastes one’s time; it causes
both pupils and adults to glide over words
for which they do not have the correct pronunciation,
rather than take the time to look
the words up in the dictionary. We need a
new symbolization that would have one letter,
and only one letter, for each English speech
sound, e.g., the phonetic spelling to the right
of each word in Webster’s New International
Dictionary, Second Edition.

6. Teachers are urged to examine the Inter- 
national Phonetic Alphabet as an example of 
an alphabet that could be used in lieu of our 
sixteenth-century symbolization. The Interna- 
tional Phonetic Alphabet can be found on the 
front page of “A Guide To Pronunciation” in 
Webster’s New International Dictionary, 
Second Edition, p. xxii. 

7. Teachers should know the most charac- 
teristic phase of the position and movement 
of the speech organs for each English speech 
sound. 

8. Every pupil used in this study was suffering
from malnutrition, ranging anywhere
from a mild to a severe iron deficiency
anemia. 

9. An iron deficiency anemia is an indicator
of the existence of other anemias. If
the anemias were removed, would learning
take place more rapidly?


CHILDREN’S RATINGS OF ASSOCIATES


M. AMATORA TSCHECHTELIN 
St. Francis College, Fort Wayne, Indiana 


How does the child appear in the eyes of 
his peers? This is a question teachers often 
have asked, yet experiments showing the 
estimates by boys and girls of their class- 
mates are slow in forthcoming. 

In the present investigation the writer 
sought to ascertain just this. How do boys 
rate boys, and how do they rate girls on the 
various items that go to make up the child’s 
personality? And, vice versa, How do girls 
rate boys and how do they rate girls? In 
studying this problem, the 22-Trait Personal- 
ity Rating Scale was used. The reliability of 
this scale is treated elsewhere.¹

The answer to the first question may be 
given by reference to Table I. Preparation of 
this table required the computation of class 
means for boys’ ratings of boys and of girls 
separately. Each mean (M_Bb and M_Bg)² is
based on 1,000 boys’ ratings and 1,000 girls’ 
ratings in each of Grades IV, V, and VI; on 
600 boys’ ratings and 600 girls’ ratings in 
Grade VII; and on 400 boys’ ratings and 400 
girls’ ratings in Grade VIII. A glance at this 
table as a whole surprises one somewhat for, 
contrary to the usual expectation, the ratings 
by boys do tend to favor the boys. However, 
closer scrutiny does reveal somewhat of a 
trend, for it will be noticed that the fourth- 
grade boys rate the girls slightly higher on 
two items; the fifth-grade boys rate the girls 
higher on three items; the sixth-grade boys 
do so on eight items; the seventh-grade boys 
on eleven items; and the eighth-grade boys 
on eight items. When grades are combined, 
the girls are favored on only six items. Here 
is a steady increase of boys’ relative ratings 
favoring the girls from the fourth grade 
through the seventh grade, and a slight drop 
for the eighth grade. Perhaps this latter may 
be less surprising, when it is recalled that in 
this study the boys and girls are rating the 
boys and girls in their respective grades, 
whereas other studies show that girls of 
eighth-grade level are more interested in 


1 M. Amatora Tschechtelin, “A 22-Trait Personality Rating
Scale”, Journal of Psychology, 18 (July, 1944), 3-8.

2 Capital letters indicate ratees; lower-case letters indicate 
raters. 


older, ninth- or tenth-grade boys. The peak at 
the seventh-grade level here, and the drop 
thereafter, may be an indication of similar 
tendencies in the adolescents of the present 
study. Further study of this table reveals 
grade-level trends in boys’ ratings of girls in 
certain items: the highest t value is that for 
Item 1, “pep”; for all grades combined this
is 10.36, which is highly significant (1 per
cent level = 2.58). This is the only trait in
which the difference is statistically significant 
at all grade levels. With the exception of 
Grade IV, Item 2 (intelligence) slightly 
favors the girls. However, none of these t’s, 
except that for intelligence in Grade IV, is 
significant. Only at the eighth-grade level do 
the boys consider the girls more sociable 
(Item 3). In Item 6 (religiousness), the only 
item in which the t favors the girls at all 
grade levels, there is a gradual rise in the t 
value until Grade VIII, when it drops 
slightly. In Grades IV and V the boys judge
the boys to be more polite (Item 7) and more 
neat (Item 13), while in Grades VI and VII 
the girls are so judged. Likewise are the boys 
more cooperative (Item 8) in Grades IV, V, 
and VI and the girls in Grades VII and VIII. 
In disposition (Item 16), there is a trend 
favoring the boys, reaching a peak at the 
sixth-grade level, and thereafter dropping in 
the seventh grade and favoring girls at the 
eighth-grade level. 
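
The reading of a t value against the significance levels used in this report can be sketched as follows. The 1 per cent criterion of 2.58 is the one quoted above; the 5 per cent criterion of 1.96 is the usual large-sample normal value and is an assumption here, since the text does not print it. The function name is illustrative only.

```python
# Large-sample criteria for a two-sided test.
# 2.58 is the 1 per cent value quoted in the text; 1.96 is the
# conventional 5 per cent value and is assumed, not taken from the text.
ONE_PER_CENT = 2.58
FIVE_PER_CENT = 1.96


def significance_level(t: float) -> str:
    """Return the strictest level at which |t| is significant."""
    size = abs(t)
    if size >= ONE_PER_CENT:
        return "1 per cent"
    if size >= FIVE_PER_CENT:
        return "5 per cent"
    return "not significant"


# The t for Item 1 ("pep"), all grades combined, as reported above.
print(significance_level(10.36))
```

The sign of t is ignored in this sketch, because the tables report separately which group each difference favors.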

Table I may be summarized by stating 
that: 


1. The boys at all grade levels rated boys 
significantly higher than they did girls on 
“pep”. 

2. Exclusive of “pep”, there are only 4 of 
the 110 t’s for individual grades significant 
at the 5 per cent level. 

3. When grades were combined, 9 t’s are 
significant at the 1 per cent level, and 6 
others are so at the 5 per cent level. 


One might conclude that, even though some
differences are small and decrease with
ascendance in grade level, boys do tend to
rate boys higher than they rate girls.


TABLE I
DIFFERENCES IN BOYS’ RATINGS OF BOYS AND OF GIRLS (Bb AND Bg)

[Table body not recovered. Column heads: Grade 4, Grade 5, Grade 6, Grade 7, Grade 8, Grades 4-8, each with “t” and “Favor”.]


In analyzing Table II, one discovers the
general tendency of the girls to rate girls 
higher than they rate the boys. The trend is 
markedly stronger than it was in the boys’ 
ratings (Table I). Fourth-grade girls rate the 
boys higher on only the three items of “pep”,
sociability, and punctuality (Items 1, 3, 7); 
fifth-grade girls do so only on neatness (Item 
13); sixth-grade girls do so in no case; 
seventh-grade girls do so only on “pep” 
(Item 1); eighth-grade girls do so on the five 
items of “pep”, intelligence, nervous-calmness,
punctuality, and boisterous-quietness. Do the 
eighth-grade girls perhaps consider the boys 
of their own grade too calm and too quiet for 
their anxious spirits? Or are they perhaps 
comparing these boys with older boys of their 
acquaintance? Yet, it must be remembered 
that these t values, except for intelligence in 
the fifth grade, are not statistically significant. 
Only those t values throughout the table that 
are above 3.50 in Grade VIII are statistically
significant at the 1 per cent level. With this
in mind, one perceives a slight general trend
rising from Grade IV, reaching a peak in
Grade VI, dropping in Grade VII, and becoming
very low in Grade VIII.

Even though most of these differences are 
small, when all grades are combined the t’s 
are significant in all cases except Items 1 
(pep) and 17 (good sport). The girls admit 
the boys are slightly more “peppy” and that 
the girls are but slightly “better sports”. 

In summarizing the ratings of boys and 
girls by girls from Table II, one might say 
that: 

1. The girls at all grade levels rated girls 
higher than they rated boys. 

2. In individual grades, 30 of the 110 t’s 
are significant at the 1 per cent level, and 20 
more are significant at the 5 per cent level. 

3. When grades were combined, 20 t’s are 
significant at the 1 per cent level. 





AN INVESTIGATION OF EXPERIMENTAL STUDIES WHICH
COMPARE METHODS OF TEACHING ARITHMETIC¹


MINNIE B. KNIPP
The Johns Hopkins University 


STATEMENT OF THE PROBLEM 


This study is concerned with those experi- 
mental investigations reported between 1911 
and 1940 that compare methods of teaching 
arithmetic in grades one through nine. It 
analyzes them in an attempt to answer the 
following questions: 


1. What changes have occurred in inter- 
ests, procedures, and results in experi- 
mentation concerned with the teaching 
of arithmetic during the interval from 
1911 to 1940? 

2. What is the relationship of interests and
procedures to the nature of the reported 
results? 


Various bibliographies of books and articles 
on the teaching of arithmetic, along with 
summarizing studies and their bibliographies, 
were consulted to locate experimental studies. 
The available research studies, conducted in 
the United States, that compared methods of 
teaching arithmetic were selected for investi- 
gation. From an original bibliography of 
approximately one hundred and fifty studies, 
fifty-seven fell within the limitations described 
above and included sixty-four experiments. 


PROCEDURE 


These sixty-four experiments were carefully 
analyzed, in order to ascertain whether there 
existed any trends in experimental interests, 
procedures, and results; and also to discover 
whether there seemed to be any relationship 
between interest in the specific fields of in- 
vestigation and reported results, or between 
procedures and these results. The analysis 
was tabulated in such a way that it indicates 
for each experiment the aspect of subject 
matter investigated, the grade level in which 
the experimentation was set, the methods 
compared, the kind of experimental control
employed, the factors held constant in experiments
using equated and paired group techniques,
the results reported for the different
levels of mental ability, the size of the experimental
group, the length of the experimental
period, the kind of measuring instrument
employed, the method adopted for
measuring achievement, the statistical tests
of significance applied to the reported differences,
and the nature of the results reported.


1 Abstract of a thesis submitted in partial fulfillment of the
requirements for the degree of Doctor of Philosophy in the
School of Higher Studies in Education of The Johns Hopkins
University, under the direction of Dr. Florence E. Bamberger.


The terms, “predominantly significant”, 
“predominantly insignificant”, and “no pre- 
dominant differences”, are used to give a 
clearer meaning to the nature of the reported 
results. If an experimenter has stated, or im- 
plied, that a significant or an insignificant 
difference was found to exist or, if he has re- 
ported several results more than half of which 
he has claimed are significant or insignificant,
the terms, “predominantly significant” or 
“predominantly insignificant”, are used. If 
the investigator has claimed no difference in 
results, or if he has presented several results 
in such a way that those favoring one method 
were equal to those favoring some other 
method, they are classified as “no predomi- 
nant difference”. 
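
The three labels just defined can be expressed as a small decision rule. The sketch below is one reading of those definitions, not the procedure actually used in the thesis. The result codes ("significant", "insignificant", "none") and the function name are hypothetical, and treating every case without a majority as "no predominant difference" is an assumption.

```python
from collections import Counter


def classify_experiment(results):
    """Label one experiment from its list of reported results.

    Each entry is "significant", "insignificant", or "none" (the
    experimenter claimed no difference). These codes are hypothetical;
    the majority rule follows the definitions given in the text.
    """
    total = len(results)
    if total == 0:
        raise ValueError("an experiment needs at least one reported result")
    counts = Counter(results)
    # "More than half" of the reported results decides the label.
    if counts["significant"] * 2 > total:
        return "predominantly significant"
    if counts["insignificant"] * 2 > total:
        return "predominantly insignificant"
    # Assumed fallback: no majority, or no difference claimed.
    return "no predominant difference"


print(classify_experiment(["significant", "significant", "insignificant"]))
```

An experiment with a single reported result falls under the same rule, since one result out of one is more than half.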


In order to determine the changes that 
have occurred in interests, procedures, and 
results, the total number of experiments was 
divided chronologically into groups compris- 
ing five years each and classified according 
to these interests, procedures, and results. 
Since not all of the studies state the date of 
experimentation, the date of reporting the in- 
vestigation was used; that is, the date of 
publication or the date of acceptance by a 
university for unpublished theses. This chron- 
ological arrangement was employed to dis- 
cover what differences were discernible in: 


1. Aspect of subject matter in which experimentation
was conducted.

2. Grade level in which experiments were
carried on.

3. Methods compared.

4. Experimental techniques employed.

a. Method of control.

b. Factors held constant in experiments
using equated and paired group
methods of control.

c. Reporting results for different levels
of mental ability.

d. Number of cases used.

e. Length of experimental period.

f. Kind of measuring instrument used.

g. Method of measuring achievement.

h. Statistical tests of significance
applied.

5. Nature of reported results.


The experiments were also sorted into 
groups representing each of the three kinds 
of results (predominantly significant, predominantly
insignificant, and no predominant
difference) in order to discern the relationship 
of the experimental interests and procedures 
to the nature of the results reported. Tables 
were then compiled, showing the number of 
experiments in each of these classifications 
and also indicating the various interests and 
procedures. 
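
The chronological grouping and the compiling of tables described above can be illustrated in a few lines. This is a minimal sketch under stated assumptions: the five-year periods are taken to run 1911-1915 through 1936-1940, which agrees with the intervals named in this report, and the sample records are invented for illustration and are not data from the study.

```python
from collections import Counter

FIRST_YEAR, LAST_YEAR, SPAN = 1911, 1940, 5


def period_for(year):
    """Return the (start, end) five-year period containing `year`."""
    if not FIRST_YEAR <= year <= LAST_YEAR:
        raise ValueError(f"{year} lies outside {FIRST_YEAR}-{LAST_YEAR}")
    start = FIRST_YEAR + ((year - FIRST_YEAR) // SPAN) * SPAN
    return (start, start + SPAN - 1)


def tabulate(experiments):
    """Count experiments by (period, kind of result)."""
    return Counter((period_for(year), result) for year, result in experiments)


# Hypothetical records: (date of report, kind of result).
sample = [
    (1913, "predominantly significant"),
    (1927, "no predominant difference"),
    (1929, "no predominant difference"),
    (1938, "predominantly insignificant"),
]
table = tabulate(sample)
for (start, end), result in sorted(table):
    print(f"{start}-{end}  {result}: {table[(start, end), result]}")
```

The same counting could be repeated with grade level, method of control, or any other recorded procedure in place of the period.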

In addition to the tabular analyses, each 
experiment was abstracted and commented 
upon. Several limitations in experimental 
procedure were noted and those that occurred 
rather frequently were summarized. 


RESULTS 


I. CHANGES NOTED IN INTERESTS, 
PROCEDURES, AND RESULTS 


1. Aspect of Subject Matter 


There was a rather steady decrease in the 
percentage of reports investigating separately 
each of the two aspects of arithmetic, (1) 
fundamental operations and (2) reasoning 
processes, accompanied by a gradual increase 
in the percentage of studies examining these 
two phases simultaneously. 


2. Grade Level 


In the earlier days of educational experi- 
mentation the investigators were concerned 
with methods of teaching arithmetic in the 
intermediate and upper grades (grades 4-9). 
After 1926, however, experiments were con- 
ducted in the primary grades (grades 1-3) 
almost as often as, or oftener than, in the 
higher grades. In all of the studies examined,
methods of teaching arithmetic in the first
grade were given the least attention, and 
methods in the sixth grade the most emphasis. 


3. Methods of Teaching Compared 


Among the many different methods of 
teaching compared, drill of one sort or an- 
other interested the investigators throughout 
the whole thirty years, but this interest began 
to decrease in 1921, when experimenters be- 
gan to compare methods of teaching problem 
solving. 


4. Experimental Techniques Employed 


a. Method of control.—Experiments involving
equated and paired groups were conducted
throughout the thirty-year period. In
each of the five-year intervals, except 1936-1940,
the percentage of experiments utilizing
equated groups exceeded that in which paired 
groups were used. From 1936 to 1940 this 
difference disappeared and the percentages 
were the same. There was also a gradual de- 
crease in the percentage of investigations in 
which equated groups were employed. 

b. Factors held constant.—Three factors
were controlled more frequently and more
consistently than any other number of factors.
The percentage of studies in which three
factors were held constant increased from
1911 to 1926, when it began to decrease. No
decided trends as to which specific factors 
were controlled were apparent. Grade level 
and initial ability were consistently controlled 
in each of the five-year periods, but the per- 
centage of experiments in which they were 
held constant did not show any regularity in 
this practice. The percentage of studies in 
which intelligence (as indicated by mental 
age, intelligence quotient, or intelligence-test 
score) was controlled increased in each five- 
year interval, except from 1916 to 1920 when 
it was zero. 


c. Reporting results for levels of mental
ability.—With one exception, investigators
did not report results for the different levels 
of mental ability until 1926. After that time 
a number of them included such reports, but 
there was no regularity in their appearance. 


d. Number of cases used.—Between fifty
and one hundred cases were used in the larg- 
est percentage of the experiments. This num- 
ber of cases was used more consistently than 
was any other size of group.

e. Length of experimental period.—A large
proportion of the early experiments were 
conducted over a period of from five to ten 
weeks, but after 1936 more of them length- 
ened the time of experimentation to approxi- 
mately one semester, which was preferred to 
any other length of time. 


f. Kind of measuring instrument used.—A
rather pronounced use of standardized tests 
in the decade from 1916 to 1926 was ob- 
served. This practice may have been due to 
an increase in the number of standardized 
tests available and to the emphasis placed 
upon the importance of objectivity in marking,
which resulted from reports calling attention
to the unreliability of teachers’ marks.²


g. Measure of achievement.—Approxi- 
mately two-fifths of the investigations re- 
ported results in terms of final test scores. 
Two of these experiments appeared as re- 
cently as 1940. Until 1925 central tendency 
was measured most frequently by per cent 
(per cent of accuracy, per cent of gain, per 
cent of superiority of one group over another, 
and per cent of pupils or groups of pupils). 
After 1925 central tendency was indicated 
more frequently by the mean. Before 1936 
two or three measures of central tendency 
were reported in several of the studies, but 
since then in each experiment only one 
measure, mean or median, has been given. 
Only twenty of the sixty-four experiments 
reported explicitly measures of variability. 
Twelve presented the standard deviation or 
probable error of the difference, or the ratio 
of the difference to its standard deviation or 
probable error, without giving these measures 
for the individual measures of central tend- 
ency. Nearly twice as many indicated the use 
of the standard deviation as reported the use 
of the probable error. Except during the five 
years from 1926 to 1930, there was a constant 
or increasing percentage of experiments re- 
porting the use of each of these measures of 
variability. 

2 D. Starch and E. C. Elliott, “Reliability of Grading High
School Work in English”, School Review, 20 (September,

D. Starch and E. C. Elliott, “Reliability of Grading High
School Work in Mathematics”, School Review, 21 (April,
1913), 254-59.


h. Statistical tests of significance.—No investigation
presented a test of the statistical
significance of its results before 1926. From
1926 to 1930 the ratio of the difference to its
probable error was used more frequently than
any other measure. During the next ten years
the ratio of the difference to its standard
deviation was applied more often. No test designed
to indicate the reliability of a difference
in small samples was employed.
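
The two ratios named here can be computed from group summaries as in the sketch below. It assumes two independent (uncorrelated) groups and the normal-curve relation that a probable error is 0.6745 times the corresponding standard deviation. Paired groups would need a correlation term that is not shown. The function name and the figures in the example are hypothetical.

```python
import math

# Under the normal curve, probable error = 0.6745 x standard deviation.
PE_FACTOR = 0.6745


def difference_ratios(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (D / SD_D, D / PE_D) for two independent group means."""
    difference = mean_a - mean_b
    # Standard deviation of the difference between two uncorrelated means.
    sd_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    pe_diff = PE_FACTOR * sd_diff
    return difference / sd_diff, difference / pe_diff


# Hypothetical groups of fifty pupils each.
ratio_sd, ratio_pe = difference_ratios(52.0, 10.0, 50, 48.0, 10.0, 50)
print(ratio_sd, ratio_pe)
```

Because the probable error is the smaller unit, the same difference always yields a larger ratio to its probable error than to its standard deviation, so the two ratios cannot be compared against the same criterion.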


5. Nature of Results Reported 


A rather steady decrease in the percentage 
of experiments reporting “predominantly sig- 
nificant difference” was noted. After 1926 “no 
predominant difference” was reported in 
about the same percentage of experiments in 
each of the five-year intervals. “Predomi- 
nantly insignificant differences” were not in- 
dicated before 1916. They were claimed in 
each of the five-year periods after 1916, but 
the percentage of studies indicating such 
results was irregular. 


II. RELATIONSHIPS OBSERVED 


1. Aspect of Subject Matter 


Investigations concerned with methods of 
teaching fundamental operations generally re- 
ported predominantly significant or predomi- 
nantly insignificant differences, while those 
dealing with methods of teaching problem 
solving and those concerned with both phases 
of subject matter seemed to find no predomi- 
nant difference in the methods investigated. 


2. Grade Level 


Experiments conducted in the intermediate 
grades (grades 4, 5, 6) reported, in general, 
that there were both predominantly signifi- 
cant differences and also no predominant dif- 
ferences in the methods examined, while those 
carried on in the lower grades (grades 1, 2, 3) 
usually reported predominantly insignificant 
differences. In the upper grades (grades 7, 8, 
9) there was no particular relationship be- 
tween results and grade level. The sixth grade 
is the only level in which any striking relation 
to the nature of the reported results appeared. 
Approximately half of the investigations 
claiming predominantly significant differences 
took place at this grade level. 


3. Methods of Teaching Compared 


Experiments in which various methods of 
drill were compared, or in which comparisons 
were made of drill with no drill, claimed only 
predominantly significant or predominantly 
insignificant differences, and indicated predominantly
significant differences more often
than predominantly insignificant ones. Those 
investigations in which methods of teaching 
problem solving were studied disclosed, more 
generally, no predominant differences. 


4. Experimental Techniques Employed 


a. Method of control.—Experiments utilizing
equated groups reported each of the three
kinds of results more often than those em- 
ploying any other method of control. A larger 
proportion of those finding no predominant 
difference than of those expressing predomi- 
nantly significant or predominantly insignifi- 
cant differences made use of this technique. 
Not one of the investigations using rotated 
groups or of those using unequal groups re- 
ported the kind of result designated as no 
predominant difference. Studies in which the 
method of control was not stated generally 
claimed predominantly significant differences. 


b. Factors held constant.—Experiments in
which less than five factors were controlled 
reported predominantly significant differences 
and predominantly insignificant differences 
approximately three times as frequently as 
those in which a larger number of factors was 
held constant, while those controlling more 
than five factors reported no predominant 
difference about three times as frequently as 
those holding constant less than five factors. 
A greater proportion of the investigations 
claiming predominantly significant differences 
controlled initial ability, and a larger propor- 
tion of the studies reporting predominantly 
insignificant differences held grade level con- 
stant. The greater proportion of those finding 
no predominant difference controlled the three 
factors of intelligence, previous achievement, 
and teacher. 


c. Reporting results for levels of mental
ability.—Investigations that included measures
to indicate achievement at the different
levels of mental ability reported no predomi- 
nant difference in methods more often than 
did studies which did not investigate achieve- 
ment at the several mental-ability levels. Ex- 
periments that reported only for the whole 
group claimed more frequently predominantly 
insignificant differences. 

d. Number of cases used.—Experiments
conducted with less than one hundred subjects 
indicated predominantly insignificant differences
slightly more often than they indicated
predominantly significant differences and considerably
more frequently than they reported
no predominant difference. Those using be- 
tween one hundred and two hundred cases 
reported only predominantly significant dif- 
ferences. The studies involving between two 
hundred and five hundred cases found most
frequently no predominant difference. Inves- 
tigations carried on with more than five hun- 
dred pupils indicated one kind of difference 
just about as often as another. 

e. Length of experimental period.—Experiments
conducted over a six-week period or
less indicated predominantly significant dif- 
ferences more often than predominantly insig- 
nificant differences. The experiments carried 
on for longer periods of time showed predomi- 
nantly significant differences less frequently 
than did studies conducted for shorter 
periods. 


f. Kind of measuring instrument.—On the
whole, investigations in which standardized 
tests were used reported predominantly sig- 
nificant differences, while experiments in 
which non-standardized tests were employed 
reported predominantly insignificant differ- 
ences. 

g. Measuring achievement.—Approximately
two-thirds of the experiments reporting each 
of the three types of results measured gains. 
Among the studies measuring gains a slightly 
higher percentage reported no predominant 
difference, while a slightly lower percentage 
claimed predominantly insignificant differ- 
ences. The mean was used more frequently to 
measure central tendency than was any other 
measure. It was also employed in a larger 
percentage of studies revealing no predomi- 
nant difference, regardless of whether this 
difference was indicated in terms of gains or 
final test scores. A slightly larger proportion 
of experiments claiming to have found pre- 
dominantly insignificant differences included 
the standard deviation in addition to the 
measure of central tendency. The standard 
deviation was used more frequently than any 
other measure of variability. The same per- 
centage of studies finding both predominantly 
significant and no predominant difference re- 
ported standard deviation as presented prob- 
able error. 


h. Statistical tests of significance.—About
one third of the experiments reporting each
kind of result submitted statistical evidence
of the significance of the findings. The ratio
of the difference to its standard deviation was 
used to test statistical significance in a larger 
percentage of studies resulting in no predomi- 
nant difference than in studies claiming pre- 
dominantly significant or predominantly in- 
significant differences. This test was also 
applied in a larger percentage of experiments 
than was any other measure. 


III. LIMITATIONS OF EXPERIMENTAL WORK


1. Forty-four experiments omitted measures 
of variability. 

2. Forty-two experiments omitted statistical 
tests of significance of the differences found. 

3. Twenty-five experiments measured 
achievement by means of a final test score, 
instead of determining growth as indicated by 
gain of final test score over an initial test 
score. 

4. Twenty-four experiments used non- 
standardized tests to measure achievement. 

5. Eleven experiments used the ratio of the 
difference to its probable error, or the ratio 
of the difference to its standard deviation, for 
small samples (32-85 cases). These measures
assume a normal distribution of many cases 
and, hence, may not be applicable. 

6. Eight experiments failed to indicate the 
equivalence of the groups. 
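
The counts in the six limitations above are easier to compare as shares of the sixty-four experiments. The sketch below only restates those counts as percentages; the shortened labels and the function name are illustrative.

```python
TOTAL_EXPERIMENTS = 64

# Counts as listed in the six limitations above.
limitations = {
    "omitted measures of variability": 44,
    "omitted statistical tests of significance": 42,
    "measured achievement by final test score only": 25,
    "used non-standardized tests": 24,
    "applied large-sample ratios to small samples": 11,
    "failed to indicate equivalence of groups": 8,
}


def share(count, total=TOTAL_EXPERIMENTS):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


for label, count in limitations.items():
    print(f"{label}: {count} of {TOTAL_EXPERIMENTS} ({share(count)} per cent)")
```

Twenty-five of sixty-four is about 39 per cent, which agrees with the earlier statement that approximately two-fifths of the investigations reported final test scores. Likewise, forty-four omissions agree with the twenty experiments said to have reported measures of variability.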


CONCLUSIONS 


I. CHANGES IN INTERESTS, PROCEDURES, 
AND RESULTS 


1. Investigators now seem interested in
examining:

a. Both fundamental operations and reasoning
processes in a single experiment,
as well as in studying separately methods
of teaching these two aspects of
subject matter.

b. Methods of teaching arithmetic in the
primary grades, as well as in the intermediate
and upper grades.

c. Methods of teaching problem solving,
as well as procedures for teaching drill.


2. Persons conducting experiments to de- 
termine the superiority of one method over 
another are now generally: 


a. Making use of the paired-group technique
to the same extent as the equated
group technique.

b. Controlling more factors than formerly.

c. Attempting to hold constant some
measure of mental ability.

d. Reporting, to some extent, results for
several levels of mental ability.

e. Using between fifty and one hundred
cases.

f. Carrying on experimentation for approximately
one semester.

g. Measuring achievement by means of
standardized tests or both standardized
and teacher-made tests, presenting this
achievement in terms of mean gains,
some of which are accompanied by the
standard deviation.

h. Applying the critical ratio as a statistical
test of significance of the differences
observed.


3. Experimenters are reporting no predomi- 
nant differences in an increasing percentage 
of investigations. 


II. RELATION OF INTERESTS AND 
PROCEDURES TO RESULTS 


1. Investigators interested in methods of 
teaching arithmetic in the sixth grade, and 
those interested in studying various kinds of 
drill, generally reported predominantly signifi- 
cant differences, while workers engaged in 
examining methods of teaching problem solu- 
tion seemed to find no predominant difference. 
The experimenters who compared different 
methods of teaching fundamental operations 
reported predominantly insignificant differ- 
ences nearly as frequently as predominantly 
significant differences. 

2. Results reported as predominantly sig- 
nificant were associated with experiments in 
which: 

a. The method of control was not indicated. 

b. Initial ability was controlled. 

c. Between one and two hundred cases
were used.

d. The length of the experimental period
was sixteen weeks or less.

e. Achievement was measured by means
of standardized tests.


3. Predominantly insignificant results were 
associated with experiments in which: 
a. Grade level was held constant. 
b. No reports were included for the dif- 
ferent levels of mental ability. 
c. Achievement was measured by non-standardized
tests.


4. Reports of no predominant difference
were associated with experiments in which:

a. Groups were equated.

b. More than five factors were controlled.

c. The three factors of intelligence, previous
achievement, and teacher were held
constant.

d. Reports were presented for the different
levels of mental ability.

e. Between two hundred and five hundred
pupils were used.

f. Achievement was determined by growth
scores.

g. Central tendency was indicated by the
mean.

h. The critical ratio was applied as a test
of statistical significance.


5. Both predominantly significant and pre- 
dominantly insignificant differences were 
associated with experiments in which less 
than one hundred pupils were used. 


III. LIMITATIONS OF EXPERIMENTAL WORK


The limitations of experimental work as 
pointed out indicate that: 


1. More care should be exercised in con- 
ducting and reporting experimental investiga- 
tions that compare methods of teaching 
arithmetic. Many of the experiments that re- 
sulted in no predominant difference in meth- 
ods employed the equated group technique, 
controlled five or more factors (among which 
were intelligence, previous achievement, and 
teacher), determined achievement in terms of 
growth, and tested the statistical significance 
of their findings. These results suggest the 
possibility that, had some of the investigators 
reporting predominantly significant differ- 
ences been more careful about techniques and 
controls, used larger groups of pupils, and 
carried on their experimentation a longer 
time, they might have secured different 
results. 

2. Caution should be employed in apply- 
ing the results of experimentation concerned 
with the comparison of methods of teaching 
arithmetic. Owing to the limitations of many 
of these studies, it is suggested that all such 
investigations be read and reviewed in a very 
critical manner before any result is accepted
or applied in practice.


IV. SUGGESTIONS FOR FURTHER RESEARCH


Analyses of the following investigations 
should prove interesting, as well as valuable: 

1. Experiments comparing methods of 
teaching arithmetic that have been completed 
since 1940. 

2. Experiments comparing methods of 
teaching secondary-school mathematics. 

3. Experiments comparing methods of
teaching other subjects at both the elementary-
and secondary-school levels.


NEW OBJECTIVES FOR NINTH GRADE MATHEMATICS:
AN EXPOSITION AND APPRAISAL*


WILLIAM M. WILLITS
Reading, Pennsylvania, Public Schools 


NEED FOR EXAMINATION OF NINTH GRADE 
MATHEMATICS OBJECTIVES 


The general nature of the problem under 
analysis in this study is indicated in the fol- 
lowing statements: 

1. Mathematics as a field of study has im- 
portant contributions to make toward the total 
of skills, habits, attitudes and generalizations 
everyone should have. Especially is this true 
throughout the junior high school years, where 
the fundamental skills and concepts relating 
to computation and measurement are added 
to and expanded. 

2. Tradition has influenced in undue de- 
gree the mathematics of the ninth grade. It 
has stood in the way of establishing objectives 
based on the needs of all pupils. 

3. Social and economic change, always 
ahead of the school, has outdistanced mathe- 
matics instruction to such an extent that, 
apparently, as the need for a universal
knowledge of mathematical concepts becomes 
greater, the smaller is the percentage of pupils 
studying mathematics. Vitalization of objec- 
tives is of first importance, in order to estab- 
lish mathematics in the place it should have 
in modern life. The obligation of ninth-grade 
mathematics is so to vitalize instruction that 
the needs of all pupils are more nearly met. 

4. Social and economic change also has 
operated to retain more and more pupils in 
grades nine to twelve. New objectives must 
consider a pupil population much larger and 
much more variable than was the case a gen- 
eration ago. 

Statements of general objectives guiding 
the teaching of secondary mathematics have 
long recognized a dual aspect of objectives. 
There is, on the one hand, the factual or im- 
personal aspect, with the objectives embody- 
ing the practical ideas of skill, utility, and 
information. On the other hand, there is the 
personal or psychological phase, exemplified 
in statements relating to understanding, 

* Abstract of a dissertation accepted in partial fulfillment
of the requirements for the degree of Doctor of Education in
Teachers College, Temple University, January, 1944.


habits, and appreciations. Courses of instruc- 
tion, presumably built upon this dual aspect, 
tend to emphasize the first phase and gener- 
ally to assume that the second is either a 
matter of teaching technique to be left to the 
individual instructor, or is a natural result 
growing from effective treatment of the first. 
That new thought is now given to the relative 
importance of the two phases is evident in the 
words of the Joint Commission of the Mathe- 
matical Association of America and the 
National Council of Teachers of Mathe- 
matics: 

A clear recognition of these two essentially 
different yet complementary types of objec- 
tives is one of the achievements of recent edu- 
cational theory. It is generally conceded that 
in the past the chief emphasis was on imper- 
sonal or factual objectives. . . . Educational 
advancement demands that due weight be 
given to both types of objectives.¹

Moreover, forces of long standing have 
operated to give pre-eminent place to the 
factual type of objective. Among these influ- 
ences are, or have been, the concept of mathe- 
matics as mental discipline; the development 
traditionally of a rigorous, sequentially organ- 
ized body of mathematics easily measured by 
means of objective tests, for which a unit of 
credit could be assigned upon evidence of a 
certain degree of mastery; the domination of 
secondary mathematics by the requirements 
of college preparation, and the corollary idea 
that what is good for students preparing for 
college is equally good for the non-college 
student. Persistent is the idea that a mathe- 
matics course is not a mathematics course 
unless it is taught as a logically organized 
body of concepts, principles, and skills.²

So far as ninth-grade mathematics in par- 
ticular is concerned, the power of these forces 
has been diminished by operation of other 


1 The Place of Mathematics in Secondary Education. Fif-
teenth Yearbook of the National Council of Teachers of
Mathematics. New York: Bureau of Publications, Teachers
College, Columbia University, 1940.

2 William Betz, “The Present Situation in Secondary Mathe-
matics with Particular Reference to the New National Reports
on the Place of Mathematics in Education”, Mathematics
Teacher, 33 (December, 1940), 339-360.
influences. The great increase in school popu-
lation, with increased variability in abilities
and interests, has been perhaps the most
powerful. It has led to warnings by Douglass³
and Jablonower⁴ that the traditional algebra
of the ninth grade requires on the average a
minimum I.Q. of 105 to 110. There are also
the factors of the expanding school curricu- 
lum, putting mathematics on the defensive as 
a school study; the physical reorganization of 
the schools, tending to place ninth-grade 
mathematics in an anomalous position, either 
at the terminus of the lower secondary years 
or at the beginning of the higher secondary 
years; and the dissatisfaction, voiced in some 
quarters, demanding a socialization of teach- 
ing procedures, and setting the problem as 
more than “a repackaging of the old product”.


THE NEW OBJECTIVES SET FORTH


Any statement of proposed objectives for
ninth-grade mathematics must take primary 
consideration of two facts; namely, that 
ninth-grade mathematics, under most plans 
of secondary-school organization, is a required 
course for all pupils, and that it is a terminal 
course for a great number of pupils. Add to 
these the fact that ninth-grade mathematics 
must contribute to the aims of general (sec- 
ondary) education, and a broad foundation 
is laid upon which to build. 

It is generally accepted that the junior high 
school should not aim at a great degree of 
differentiation in the curriculum; the learn- 
ing experiences provided by the school are 
largely those that everyone should have, 
leaving to the senior high school, or the last 
three years of the secondary-school period, 
the curriculum specialization predicated upon 
the individual’s more mature interests. This 
fundamental fact places upon the junior high 
school the responsibility for determining, to 
a far greater extent than upon the later sec- 
ondary school, what those common learning 
experiences shall be. As Wrinkle and Gilchrist 
put it: 

The obvious assumption underlying the 
requiring of certain courses in the students’ 
program ... is that everyone needs to know 
the things which are taught in the courses 

3 Harl R. Douglass, Secondary Education for Youth in
Modern America, p. 29. Washington: American Council on
Education, 1937.

4 Paul Jablonower, “Recent and Present Tendencies in the
Teaching of Algebra in the High School”, Seventh Yearbook
of the National Council of Teachers of Mathematics, p. 2.
New York: Columbia University, 1932.
which are required. . . . We have assumed
for so long in education that there are certain
things which everyone needs to know that we
have quit asking ourselves whether or not
everyone actually needs to know them.⁵


Mathematics has more than a hereditary 
claim to a place as a required course in the 
junior high school curriculum. The pupil gen-
erally begins his formal mathematical learn-
ing in the third grade. By the end of the
sixth year of the elementary school his com-
putational skill is not sufficiently developed,
nor his stock of mathematical concepts suffi-
ciently enlarged, to warrant assumption that
he has all the elementary skills he should
have. His immaturity operates against the
assumption that he can acquire, before the
end of the eighth year at least, all the ele-
mentary skills he should have. In the seventh
and eighth grades he adds to his knowledge
of those skills. Thereafter, it is assumed that
he has a working command of them; he is
directed, unless he is a “slow” pupil, into a
course in algebra, with a vista of more alge-
bra, geometry, and trigonometry before him
in later years.


The fundamental skills that everyone 
should have are acceptably established in the 
seventh and eighth grades, but the assump-
tion that a first course in algebra embraces
the mathematical knowledge everyone should
have in the ninth grade is open to question. 
It would appear rather that the mathematics 
of this year is caught between tradition and 
the pressure of senior high school mathe- 
matics requirements. Douglass points out that 
“the prevailing high school curriculum in 
mathematics was formulated very much as it 
now exists in the quarter century immediately 
following the Civil War, 1865-1890”, when
high schools were largely “prep” schools en- 
gaged in getting the abler students ready for 
college, and when less than three per cent of 
the young people of high school age were 
graduating from high school. Today, of the 
millions in high school, “less than eighteen 
per cent will go to college and less than ten 
per cent will stay in college more than two 
years. The others will go into all walks of life. 
... At least a third of these do not have the 
ability and probably another twenty per cent 
can not be sufficiently interested to learn 


5 William L. Wrinkle and Robert S. Gilchrist, Secondary 
Education for American Democracy, pp. 206, 207. New 
York: Farrar and Rinehart, 1942. 
enough algebra, geometry or trigonometry to
be able to make use of it”.⁶

If the requirement of algebra in the ninth
grade is dictated by the force of tradition and
the fact that less than one pupil in five needs
it as a part of a sequence of mathematics
courses for college entrance, the idea of uni-
versal need is sacrificed. Further, it poses the
rather futile proposition of setting up objec- 
tives for a course required of all pupils when 
the objectives must necessarily be circum- 
scribed by the very nature and purpose of 
the course. 


If ninth-grade mathematics is something 
everyone should know, what has it to offer 
that everyone should know? It must offer 
more than a systematic body of mathematical 
learnings, mastery of which yields an end 
result thought of as mental development. It 
must offer more than a preparation for more 
mathematics, or for college. It must hold more 
than a promise of vocational usefulness. These 
are values in varying degree for different 
pupils; they do not represent equal values to 
all learners. Neither do they give direction to 
the sound objectives that have been stated, 
refined, and restated in the last twenty years. 
There is lacking a large, unifying concept, 
non-mathematical in nature, permeating the 
stream of mathematical concepts, and recog- 
nized as educationally good, the values of 
which the learner himself can recognize,
accept, and realize to a satisfying degree. 
That concept is an expanded idea of the 
problem-solving process. 


Clear thinking is reflective thinking and 
reflective thinking is problem solving. Mor- 
rison says that “problem solving is essentially 
and fundamentally reflective thinking, and, 
conversely, reflective thinking is problem 
solving”. He goes on to say: “The condition- 
ing factors under which thinking takes place 
are: (1) something to think about; (2) a 
method of thinking; (3) inherent capacity to 
think at all; (4) a motive for thinking”.⁷
Problem solving as the unifying concept for 
ninth-grade mathematics provides at once a 
concrete “something to think about” and a 
strong “motive for thinking”. It involves a 
“method of thinking” that is difficult to de-
velop as a pattern in a sequentially organized
course of study.

6 Harl R. Douglass, “The Double Track Plan of High ...”,
Mathematics Teacher, 36 (February, ...).

7 Henry C. Morrison, The Practice of Teaching in the
Secondary School, p. 251. Chicago: University of Chicago
Press, 1931.

The heart of problem solving is the method 
of scientific inquiry, the so-called scientific 
method. Wrinkle and Gilchrist refer to it as 
the method by which the individual meets 
his personal and group problems intelligently, 
and as the characteristic mark of democratic 
behavior toward which the secondary school 
must direct its efforts. The larger objective of 
democratic behavior and its breakdown into 
specific behaviors are given by these authors 
as follows: 


1. The product of the secondary school
meets his personal and group problems
intelligently.

   1.1 He recognizes a problem.
   1.2 He formulates the problem.
   1.3 He makes plans for the solution of
       the problem.
       1.31 He selects appropriate sources
            of evidence.
       1.32 He decides how to secure the
            evidence.
   1.4 He collects the evidence.
   1.5 He organizes the evidence.
       1.51 He analyzes the evidence for
            significant ideas.
       1.52 He pulls the significant ideas
            together.
   1.6 He formulates a tentative conclu-
       sion.
   1.7 He checks the tentative conclusion.
   1.8 He states his conclusion to the
       problem.⁸


Two authoritative mathematical reports 
have appeared recently which recognize the 
values inherent in the problem-solving process 
as applied to mathematics instruction. The 
Joint Commission report, already referred to, 
considers the following general objectives of 
secondary education, as outlined below: 


1. Ability to think clearly. Some activities
associated with clear thinking:
   a. Gathering and organizing data.
   b. Representing data.
   c. Drawing conclusions.
   d. Establishing and judging claims of proof.

2. Ability to use information, concepts, and
general principles.

3. Ability to use fundamental skills.

4. Desirable attitudes.
   a. Respect for knowledge.
   b. Respect for good workmanship.
   c. Respect for understanding.
   d. Social-mindedness.
   e. Open-mindedness.

5. Interests and appreciations.

8 William L. Wrinkle and Robert S. Gilchrist, op. cit., pp.
246-248.

To apply these objectives to mathematics, 
the Joint Commission proposes a plan of 
organizing the materials of instruction in 
terms of two principles of classification. The 
first is according to seven major subject fields: 


I. Number and computation.
II. Geometric form and space perception.
III. Graphic representation.
IV. Elementary analysis (algebra and
    trigonometry).
V. Logical thinking.
VI. Relational thinking.
VII. Symbolic representation.


The second classification is according to a 
subdivision of the above fields into categories: 


I. Basic concepts, principles and terms.
II. Fundamental processes.
III. Fundamental relations.
IV. Skills and techniques.
V. Applications.⁹


The second report, that of the Progressive 
Education Association, views general educa- 
tion as a means of providing “rich and sig- 
nificant experiences” in terms of: (1) per- 
sonal living, (2) immediate personal-social 
relationships, (3) social-civic relationships, 
and (4) economic relationships. In meeting 
their needs in the basic aspects of living, 
adolescents encounter certain problems, the 
solution of which is effected through the co- 
operative efforts of the departments of study 
provided in the secondary school. Mathe- 
matics, with its efficient symbolism and meth- 
ods, bears the responsibility for equipping 
pupils to solve those problems to which math- 
ematical concepts are applicable. Although 
the development of intelligence in analyzing 
problem situations, which is essentially re- 
flective thinking, is only a part of the purpose 
of general education, the report of the Pro- 
gressive Education Association considers it 
the “major unique contribution” that mathe- 
matics instruction possesses. In the opinion 


9 Fifteenth Yearbook of the National Council of Teachers
of Mathematics, op. cit., pp. 21-61.
of the Committee, “the study of mathematics
can be made to throw the problem solving
process into sharp relief, and so offers oppor-
tunity to improve students’ thinking in all
fields.”


Mathematical concepts, according to this 
report, should serve a dual purpose. They 
should operate to unify instruction as it bears 
on problem solving, and lead at the same time 
to a knowledge of mathematics as a science 
apart from its applications. Concepts should 
be few in number, and have properties of 
systematic recurrence and wide applicability.
Seven major concepts possess these qualities: 


I. Formulation and solution.
II. Data.
III. Approximation.
IV. Function.
V. Operation.
VI. Proof.
VII. Symbolism.¹⁰


It is worthy of note that the activities asso- 
ciated with the development of the ability to 
think clearly, as set forth in the report of the 
Joint Commission, are expanded to become 
the pervasive concepts of instruction accord- 
ing to the Progressive Education Association 
Report. The process by which the goal is to 
be reached is reversed in the two reports, 
however. The Joint Commission approves a 
sequential arrangement of mathematical con- 
cepts, principles, and processes, leading to a 
wide variety of applications, while the report 
of the Progressive Education Association in- 
dicates that a more effective learning situation 
is created through the planning of curricular 
sequences on the basis of concrete problems, 
rather than on the basis of logical sequences 
of the familiar sort. 


A course of study based upon the problem- 
solving process implies a reorganization of the 
materials of instruction. If we desire pupils 
to solve problems according to a pattern of 
thinking that is to function in their lives, we 
must teach them to solve problems according 
to that pattern. The unit of learning is no 
longer a unit of subject matter; it is the 
problem itself. This concept of the unit has 
been called the unit of adaptation by Jones, 
Grizzell, and Grinstead: 


10 Mathematics in General Education, pp. 20, 43, 59-68. 
Report of the Committee on the Function of Mathematics in 
General Education of the Progressive Education Association. 
New York: D. Appleton-Century Co., 1940.
The unit of learning consists of a group
or chain of planned, co-ordinated activities 
undertaken by the learner in order to obtain 
control over a type of life situation. The 
underlying principle is not the logical organ- 
ization of the activities themselves, often 
thought of as subject matter, nor a center of 
child interest, but the learning product to be 
achieved. This learning product is not merely 
a skill, a habit, an attitude, etc., but such an 
integrated combination of these as will result 
in an adjustment of the individual to a life 
situation. . . . The ability to meet a situation 
in life outside the school is not to be thought 
of as a generalization that results more or less 
automatically from a single unit developed in 
the school. It is the product of a series of 
units or planned activities, each differing 
from the others, just as life situations differ 
from one another.¹¹

If the problem becomes the unit of learn- 
ing, what steps in the problem-solving process 
particularly can be brought into sharp focus, 
in order that the process may become a pat- 
tern of thinking for the ninth-grade learner? 
The steps may be enumerated and discussed 
briefly as follows: 

First, the learner should recognize a prob- 
lem situation. It may be a problem arising 
from his own immediate needs, or out of his 
present or future social-civic or economic 
needs. Whatever its source, the problem must 
be genuine; it must have the flavor of a 
situation someone may meet sometime, some- 
where. The problem may be framed as a 
question: “What should I know about con- 
sumer credit?” “How do Fahrenheit and 
Centigrade temperatures compare?” These 
probably are good problems for the general 
ninth-grade level. 

Second, the problem is formulated. Is the 
problem so broad as to require delimitation 
in order to insure acceptable solution? Pos- 
sible methods of solution are investigated. 
Does the problem yield a precise word state- 
ment? Can it be expressed as a formula? Does 
the formula yield a table of values for the 
variables involved? Or, conversely, does the 
problem yield a table from which a formula 
may grow? Can a graph be drawn? Of what
value is the graph? What do the shape and
slope of the graph tell about how the vari-
ables are related? Is an equation involved?

11 Arthur J. Jones, E. D. Grizzell, and Wren Jones Grin-
stead, Principles of Unit Construction, pp. 19, 20. New York:
McGraw-Hill Book Co., 1939.
What kind of equation: linear, quadratic, a
system, or radical? Can the equation be gen- 
eralized, i.e., can a formula be derived ex- 
pressing the relationships applicable to all 
problems of the type? 
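
The questions raised in this second step can be made concrete with one of the problems named above, the comparison of Fahrenheit and Centigrade temperatures. What follows is a minimal sketch in Python and is not a part of the original study. It assumes the familiar relation F = (9/5)C + 32, which the text does not state, and the function name and the range of table values are chosen here only for illustration. The sketch expresses the problem as a formula, derives a table of values from it, and examines the slope between successive rows of the table.

```python
from fractions import Fraction


def centigrade_to_fahrenheit(c):
    """Formula statement of the problem (assumed): F = (9/5) * C + 32."""
    return Fraction(9, 5) * c + 32


# A table of values for the two variables, built from the formula.
table = [(c, centigrade_to_fahrenheit(c)) for c in range(-40, 101, 20)]

# Slope between successive rows of the table; a single value
# in this set means the relation is linear.
slopes = {
    (f2 - f1) / (c2 - c1)
    for (c1, f1), (c2, f2) in zip(table, table[1:])
}

for c, f in table:
    print(f"{c:>5} C = {float(f):>6.1f} F")
print("slopes found:", slopes)
```

Because the slope is the same between every pair of rows, a graph of the table would be a straight line; this is the kind of observation the questions about the shape and slope of the graph are meant to draw out.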

Third, data (or evidence) must be col- 
lected. Sources of data will lie partly outside 
the mathematics text, in the library, in other 
texts, periodicals, and newspapers. Non- 
mathematical data (definitions and materials 
necessary for understanding the problem) 
and simpler mathematical data may be sup- 
plied by pupil groups assigned to gather and 
present them to the class. 

Fourth, solution is effected on the basis of
the tentative solution. Solution involves sym-
bolism and operation. These may call for
techniques and methods not yet learned, which
are then taught for immediate use in the problem.
From a mathematical point of view, problem 
solving means new learning when the situa- 
tion demands. New learning becomes at once 
meaningful. 

Fifth, the validity of the solution (or con- 
clusion) is tested. The test may be a review 
of the problem, noting that the conclusion is 
in agreement with the facts or data. It may 
be a check of the accuracy of an equation’s 
solution. When pertinent, proof might call 
attention to the fact that the process of solu- 
tion has proceeded from general premises and 
sought consequences, or that it has examined 
particular instances and arrived at general 
conclusions. At the ninth-grade level, these 
ideas probably should constitute the range of 
experience with deductive and inductive 
thinking. 
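
A check of the accuracy of an equation's solution, as mentioned in this fifth step, may be illustrated by a small sketch in Python. It is offered only as an illustration under stated assumptions: the helper names are hypothetical, and the example equation, which asks at what reading the Fahrenheit and Centigrade scales agree, assumes the relation F = (9/5)C + 32, which the text does not state.

```python
from fractions import Fraction


def solve_linear(a, b, c, d):
    """Solve a*x + b = c*x + d for x; assumes a != c."""
    return Fraction(d - b) / Fraction(a - c)


def satisfies(a, b, c, d, x):
    """Test a proposed solution by substituting it into both sides."""
    return a * x + b == c * x + d


# (9/5)x + 32 = x : the reading at which the two scales agree.
a, b, c, d = Fraction(9, 5), 32, 1, 0
x = solve_linear(a, b, c, d)
print("solution:", x)
print("checks under substitution:", satisfies(a, b, c, d, x))
```

The check is kept separate from the solving so that it can also be applied to a value reached by some other means, such as reading it from a graph or a table.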

Sixth, when possible the problem is gener- 
alized. Generalization holds a dual aspect. 
On the purely mathematical side, a simple 
skill or operation may lead to a pervasive con- 
cept. A second aspect applies to the non- 
mathematical side; a conclusion drawn from 
study of a problem may lead to a fact of 
social, economic, or scientific pervasiveness. 

It is argued that any course of study which 
yields a systematic body of subject matter, 
logically arranged, in favor of one composed 
of life situations or activities, relegates math-
ematics to a position of “incidental learning”.
It is held that, from arithmetic onward, con- 
cepts, skills, and principles are not inferred 
from concrete situations; they must be 
learned “mathematically, as related elements 
of a closely knit system of ideas and 





36 JOURNAL OF EXPERIMENTAL EDUCATION 


processes”.¹² The continuity of mathematical
thinking and learning need not be destroyed
in the problem-solving process. Problem situ-
ations, chosen with careful regard for their
mathematical content, conceivably can pro- 
vide a satisfactory body of systematic learn- 
ings, with the added advantage of immediate 
use in genuine problem situations. 

Turning now to the relatively intangible 
objectives (co-operativeness, open-minded- 
ness, and social-mindedness) that mathemat- 
ics instruction hopes to realize, it is doubtful 
whether any great step is made toward their 
realization in present-day classroom proce- 
dure. Their development is circumscribed by 
the materials and techniques of instruction. 
The average ninth-grade mathematics class, 
immersed in the process of understanding and 
acquiring mathematical skills, principles, and 
concepts, certainly has little time or opportu- 
nity to give attention to matters that have 
little bearing within the confines of the sub- 
ject. For mathematics per se has little concern 
with personal traits, even though its roots are 
deep in human needs. It is only as the horizon 
is expanded to a greater socialization that 
these objectives may be realized. The 
problem-solving process offers such expan- 
sion; its method is social; its materials 
transcend mathematical boundaries. 

The problem-solving process centers atten- 
tion on the problem itself, its method of solu- 
tion, and its solution. It calls for a pattern 
of thinking, a step-by-step procedure, that in- 
volves mathematical operation and results in 
acceptable solution. Attention is thus directed 
to the nature of acceptable solution. 

At the ninth-grade level it is neither pos- 
sible nor desirable to consider all the ramifi- 
cations of acceptable solution. It is desirable 
to limit such concepts to those which the 
learner will most commonly meet. This study 
proposes the identification of problem situa- 
tions according to three types, and the selec- 
tion of problem situations for study that yield 
solutions of such types. 

The solution of a problem, in the minds of
junior high school pupils, is a numerical re-
sult, a sought-for quantity. It is arrived at by
proper manipulation of given data, all seem-
ingly precisely set forth, ready for use. The
preciseness of the data is seldom questioned.
The fact that data may be incomplete, or
inaccurate because of difficulty of measure-
ment, and the computed result consequently
inaccurate, is not stressed. The fact that solu-
tion of a problem may involve little or no
computation, but is the result of proper con-
sideration given to all data, is very commonly
not a part of mathematics instruction. There
is in these statements the embodiment of the
several types of solution of which the learner
should be aware in the course of his problem
solving. A definitive statement of the problem
types considered in this study is as follows:

12 William Betz, op. cit., p. 352.

First, there are those problems in which 
the elements are few, concrete, and fixed; all 
the necessary data are known or are readily 
at hand; solution is clear-cut, precise, and 
valid under test. Second, there are those prob- 
lems in which the elements are so numerous 
or so varying that solution must be of neces- 
sity approximate, yet workable and accept- 
able, the best possible solution. Third, there 
are those problems involving little, often no, 
computation, whose quantitative aspects are 
probably not expressed in number, whose 
solution is based upon unprejudiced interpre- 
tation of all available evidence. 

All three problem types bear similarity, in 
that they stress the importance of data. The 
first two are similar in that they may, but
need not necessarily, yield a numerical result.
The solution may be a formula expressing a 
relationship derived from all the data, or all 
the known data; it may be a graph, which 
may or may not lead to symbolic formula- 
tion; it may be a table, which may or may 
not lead to more precise formulation. The 
third type leads to no numerical result, no 
formula, or graph, or table. It recognizes the 
importance of assumptions and the influence 
of bias and prejudice. It carefully weighs the 
evidence in the light of known assumptions, 
and consciously holds personal predilection to 
a minimum. To a degree it is the postulational 
thinking of geometry, but it is applied to the 
situations of life rather than to geometric 
materials. In brief, the three problem types 
cover the field of mathematical and non- 
mathematical data on a level suited to ninth- 
grade learning, and provide experience in 
reaching mathematical and non-mathematical 
conclusions. 


THE SETTING FOR THE APPRAISAL 


This study considers two pupil groups, one 
of which participated in the new learning 
experience, while the other engaged in the 
study of first-year algebra according to the
usually accepted textbook procedures. The 
results coming from comparison of these 
groups will be interpreted within the limita- 
tions thus set. 

Preliminary matching of two or more 
groups was not feasible for certain reasons. 
Pupils in the school in which the study was 
conducted* are sectioned during the summer 
vacation on two bases: according to curric- 
ulum (language, commerce, and general) and 
according to general ability (rapid, normal, 
and slow), as measured in part by teachers’ 
estimates in terms of marks from the preced- 
ing grade and in part by intelligence quotient. 
From the ten sections of some 300 pupils con- 
stituting the ninth-grade class in September, 
two were selected on the basis of these very 
general criteria: (1) the sections were desig- 
nated as “normal” (of average general abil- 
ity); (2) the curricular interests of the pupils 
were similar; (3) the likelihood of ninth- 
grade mathematics being a final course in 
mathematics for the majority of the members 
was strong. 

Further investigation disclosed that the 
groups were somewhat dissimilar in terms of 
the results of objective measurement. The 
group showing the less ability was chosen to 
engage in the new study, and is hereafter 
designated as “Group A”; the slightly supe-
rior group, designated “Group B”, pursued
the school’s regular course of study in algebra 
under another instructor. 


TABLE I

DISTRIBUTION OF PUPILS IN GROUPS A AND B
ACCORDING TO SEX AND CURRICULUM

Group    Boys    Girls    Curriculums
A        16      23       Commerce, General
B        16      21       General, Commerce


In September Group A had forty-three 
pupils; Group B, forty-two. At the close of 
the year in June, transfers and drop-outs re- 
duced these numbers to thirty-nine and 
thirty-seven respectively. Their distribution 
according to sex and curriculum choice is 
shown in Table I. Schedules of the two cur- 
riculums were equal in periods-per-week, 
differing only in point of one subject. 


* Southern Junior High School, Reading, Pennsylvania,
1941-42.
The range of ages (as of September) was
from thirteen to seventeen, and closely similar 
in the two classes. The fourteen-year group 
predominated, with only one over-age pupil 
in each class. All pupils were in the ninth 
grade and were taking ninth-grade mathe- 
matics for the first time. 

Table II shows the median, and first and 
third quartiles, with the range of intelligence 
quotients as determined by the Otis Self- 
Administering Test of Mental Ability, Form 
B of the test having been given the preceding 
year. Also available were the scores on the 
Metropolitan Advanced Arithmetic Test, 
Form A of which was given in the sixth month 
of the eighth grade. Table II reveals the slight
superiority of Group B with respect to these 
two measures. 


TABLE II 


INTELLIGENCE QUOTIENTS AND METROPOLITAN 
ADVANCED ARITHMETIC SCORES OF PUPILS 
IN GROUPS A AND B


Group A Group B 

84—122 
101 
108 


Otis Intelligence Scores 


Range 
First Quartile 
Medi 


Of passing interest is the composition of 
the two classes according to national and 
racial backgrounds. Approximately forty-five 
per cent of Group A was of first and second 
generation south- and east-European extrac- 
tion, chiefly Italian and Polish. Group B in- 
cluded thirty-five per cent of such back- 
ground. There were two Negroes in Group A, 
but none in Group B. 


CLASSROOM PROCEDURES 


This section describes the specific classroom 
methods and techniques used in the develop- 
ment of the course of study. 


No textbook was used as a class text. Each 
problem selected for study was mimeographed 
according to a form that varied but little 
from problem to problem. The problem form 
followed generally the step-by-step technique 
of the problem-solving process. The pattern 
of doing and thinking, therefore, soon became 
habit. The problem itself was invariably
stated in question form, in order to aid in
recognition and to challenge attention. No 
data were given, but reference to sources of 
information was made. 

Supplying of data in the form of necessary 
information, definitions, and so on was left 
to a pupil group. Four pupils, serving as a 
steering committee, had the responsibility for 
looking up the source materials, finding the 
answers to a series of questions provided on 
the problem sheets, and presenting the infor- 
mation orally to the class at the beginning of 
each problem. Source materials were found 
either in the library or in the mathematics 
classroom, and the committee group assigned 
to a problem gathered the necessary data after 
school hours in time for the first presentation 
of the new problem. Data required were 
simple in nature, since these pupils were 
neither expert readers nor experienced in 
finding data. A new committee served for 
each problem, and in the course of the year 
each pupil had opportunity to serve twice. 

Discussion followed the committee’s presentation,
out of which grew understanding
of the nature of the problem, necessary
delimitations, if any, and possible methods of
solution.

The problem sheets provided guides for the 
necessary operational technique. Space was 
provided also for performing computation or 
any other operation. If new mathematical 
learning was met, it was presented on the 
sheets at this point, together with the neces- 
sary drill materials. At times drill exercises 
were assigned, to be submitted to the teacher. 
Tests on new skills followed immediately the 
teaching of the skill. 

Each problem ended with a discussion dur- 
ing which the problem was summarized, the 
results checked in the light of given data, 
the type of problem identified, and generaliza- 
tions drawn when possible. 

With the conclusion of a problem, each 
pupil as a result of his work had a complete 
written picture of the problem. He had taken 
down the information received from the steer- 
ing committee, with notes and comments; he 
had performed the necessary operation, had a 
record of new mathematical skills learned, 
and had notes and comments on the final 
discussion. 

All problems were submitted on completion 
and were marked by the instructor. Problems 
were turned back if uncorrected or otherwise
incomplete. During the course of solving the 
problem, opportunity was given for correction 
of any computational or operational work, 
and a requirement was that such correction 
had to be made. 

A mimeographed test on the problem fol- 
lowed its completion. The problem test cov- 
ered: (1) understanding of the problem, (2) 
skill in the new mathematical learnings pre- 
sented in the problem, and (3) possible gen- 
eralizations. A biweekly test on arithmetic 
fundamentals completed the testing program. 

Outside-of-class work was required,
although class time was given to supervised
study, in addition to discussion and presentation
of new materials. Assignments on certain
parts of the problem, due the following day,
were made in order to insure the problem’s
completion on time by everyone. Assignments
were made approximately two-thirds of the
time.

The foregoing procedure developed a social- 
ized situation into which the pupils entered 
in a gratifying manner. Pupils were impressed
with the fact that a completed problem
represented something done (a situation
understood or a piece of learning, mathematical
and non-mathematical, consummated), a value
difficult to realize in a conventionally
organized course of study.


THE COURSE OF STUDY 


The mathematical content of the problems 
selected for study was determined on the basis 
of three criteria, namely: (1) What content 
will best contribute to analysis of the data 
encountered in the problems? (2) What con- 
tent should the course provide, as a terminal 
mathematics course, in order to meet the 
needs of pupils in future school work and 
later? (3) What content should the course 
provide, in order to insure a reasonable prepa- 
ration for future mathematics for those who 
continue the study of mathematics? 

The first of these criteria was easily met. 
Analysis of problem data results in any one 
or more of the following solutions: 

1. A precise word statement.

2. A tabulation.

3. A graph.

4. A symbolic statement of general
application: the formula.

5. A symbolic statement of particular or
local application: the equation.

These are the broad fields of operation
dealt with in a course in elementary algebra. 
They do not imply the organization and over- 
all content of elementary algebra, however. 
Sequential subject-matter organization is con- 
cerned primarily with the systematic study 
of a topic as far as the ability and maturation 
of the learner will permit, with the elaborate- 
ness with which specific items of learning 
shall be undertaken, and with applications 
(their number and degree of difficulty) that 
shall be made. These are precisely the ele- 
ments that confuse the average learner. He 
becomes lost in the detailed process of doing. 
He is impressed with a succession of skills 
which he is told have important application; 
meeting the application, he finds it necessary 
to develop a skill of another kind. Ability to
understand and skill to perform an algebraic
process are one thing; ability to apply it is
another. The process of problem solving goes 
a long way toward resolving the dilemma. It 
bridges the gap between process and applica- 
tion. It circumscribes the “how far” and the 
elaborateness of the study of processes and 
topics. 

In determining the content to meet the 
second and third criteria, Thorndike’s volume, 
The Psychology of Algebra,¹³ and the report
of the Joint Commission¹⁴ were referred to.
On these bases, an outline of content in 
terms of abilities required for the analysis of 
problem data took the following form: 


1. A precise word statement may, in and
of itself, constitute a solution to a problem
based upon verbal data of mathematical or
non-mathematical nature. Frequently it may
lead to symbolic expression as a formula,
when quantities and their relationship are
known. Therefore, abilities should be developed:


a. To express a statement of relationship
in precise words.

b. To translate a word statement into
appropriate symbolism.

c. Conversely, to translate a symbolic
statement into a word statement.


2. Tabulation involves the recognition of
relationship between number pairs. Therefore,
abilities should be developed:


a. To make tables of related number pairs
from observed data.

b. To make tables from formulas.

c. To interpret such tables.


13 Edward L. Thorndike and Others, The Psychology of
Algebra, Chapter II. New York: Macmillan Co.

14 Fifteenth Yearbook of the National Council of Teachers
of Mathematics, op. cit., Chapter V.


3. The study of graphs involves the further 
understanding of the relatedness of number 
pairs and of other relationships. Therefore, 
abilities should be developed: 


a. To construct graphs based on tables.

b. To construct graphs based on formulas.

c. To use graphs in the study of equations.

d. To read and interpret graphs within
their limits of practicability.


4. The study of formulas involves the rec- 
ognition of relationship of more or less uni- 
versal applicability. Therefore, abilities should 
be developed: 


a. To construct formulas based on word
statement.
b. To construct formulas based on tables.
c. To evaluate formulas.
d. To transform or change the subject of
formulas.


5. The study of equations involves the rec- 
ognition of relationships presented in a spe- 
cific situation. Therefore, abilities should be 
developed: 


a. To make simple equations expressing a
particular relationship.

b. To generalize an equation into a
formula applicable to all problems in
which the same relationships hold.

c. To use a formula as an equation.

d. To solve equations of the first degree.

e. To solve pairs of equations of the first
degree.

f. To solve fractional equations.

g. To solve quadratic equations.

h. To solve simple radical equations as
met with in certain formulas.


Specific abilities not implied in the fore- 
going enumeration but included in the content 
are: 


1. Continuing study and use of arithmetic 
skills. 

2. Direct and inverse proportion. 

3. The four fundamental operations ap- 
plied to 
a. Positive and negative numbers. 
b. Simple monomials. 

4. The addition and subtraction of simple
polynomials.
5. The multiplication of binomials.
6. The division of a polynomial by a
monomial.
7. The factoring of
a. A polynomial containing a common
monomial factor.
b. The difference of two squares.
8. Arithmetic square root.
9. The use of the formula expressing the
Pythagorean relation.
10. The use of the tangent, sine, and
cosine relations.
11. Scale drawing. 
12. Simple geometric constructions. 


A series of twenty-four problems was 
planned and put into use. These problems 
constituted the core of the course and pro- 
vided the basic learning experiences for all 
pupils. A second series of problems, ten in 
number, was developed in the course of the 
year, in order to stimulate independent effort. 
These problems were given to pupils who vol- 
unteered to solve them. Pupils received them 
on Mondays and returned the finished work 
a week later, when a new group undertook 
the problems. The problems dealt in part with 
subjects of current interest. Some were sug- 
gested by pupils. All were simple in nature, 
the data either given in easily readable form 
or easily found in other sources, and involved 
simple operational skills already acquired. 
The problems studied by the entire class and 
constituting the core work of the year were: 


Problem
Number   Problem

 1   What is the Federal Wage-Hour Law?
 2   How does the Federal Income Tax work?
 3   When does water boil?
 4   How does water cool?
 5   What should I know about consumer credit?
 6   Examining evidence: Lost civilization.
 7   What are the six simple machines?
 8   How do Fahrenheit and Centigrade
     temperatures compare?
 9   Examining evidence: What is my opinion
     of the Townsend Plan?
10   How are parcel post costs computed?
11   How can algebra be used in making
     mixtures?
12   How can algebra be used to solve
     work-time problems?
13   How can algebra save time in solving
     certain geometry problems (perimeters,
     areas, volumes)?
14   What are some elementary constructions
     used by draftsmen and layout men?
15   Examining evidence: Shall the town
     of Hudson, Wisconsin, keep its toll
     bridge?
16   Mathematics in building construction.
     a. How is the strength of floor beams
        computed?
17   b. How are stairs constructed for easy
        climbing?
18   c. How does a builder determine the
        pitch of a roof and the length of
        rafters?
19   How is the airman’s ceiling measured?
20   What is the mathematical basis of
     aerial bombing?
21   How can we determine the height of
     an object at any second of its flight
     when thrown vertically in the air?
22   What is the greatest distance an aviator
     can see when flying over sea or
     level ground?
23   Examining evidence: Shall the calendar
     be changed?
24   What happens to the trapezoid formula
     when certain changes are made?


Time consumed in the study of each prob- 
lem varied from a single period to five weeks. 
The average length of time per problem 
approximated two weeks. 


THE APPRAISAL 


The problem of evaluation would be made 
gratifyingly simple, if the outcomes of a study 
such as this one could be measured entirely 
by paper-and-pencil means. Statistical analy- 
sis of the results would yield evidence com- 
monly accepted as irrefutable, so far as the 
particular group under consideration is con- 
cerned, and from which larger implications 
might be inferred. This study is concerned 
not only with the acquisition of skills; it is 
also concerned with the development of a 
thinking pattern designed to affect the be- 
havior of the individual. Ability to use a pat- 
tern of thinking conceivably may be measured 
by paper-and-pencil testing, but the end result,
its effect on behavior, is a matter of
observation. Evaluation here, therefore, is in 
terms of analysis of test results, and analysis 
of pupil reactions as observed from various 
points of view. 

1. The Columbia Research Bureau Algebra 
Test, Test 2, Form B, was selected to measure 
the skills acquired by the pupils of Group A 
and Group B in the handling of the table, the 
formula, the graph, and the equation; it also 
measured pupils’ ability to analyze and solve 
the traditional verbal problem. Although the 
test fell short of being an exhaustive measure 
of these skills and abilities, it served well to 
compare achievement of the two classes with 
test standards. Further, the measurement of 
ability in verbal problem analysis and solu- 
tion was of value from the point of view of 
indicating relationship between verbal prob- 
lem solving and the larger process of problem 
solving with which the study is concerned. 

Table III presents the range, median, and
mean scores of Group A and Group B on Part
I (Mechanics, possible score 22), Part II
(Problems, possible score 50), and the total
test. Group B did substantially better than
Group A on Part I, exceeding the norm by
more than one point, while Group A fell short
of the norm by approximately the same margin.
On Part II the condition was reversed,
Group A exceeding the norm by more than
one point, and Group B falling short by more
than two. The effect of achievement on Part
II is reflected in the medians for the total
test, Group A just reaching the norm, and
Group B failing to do so by one point. Group
A showed greater variability on both parts
of the test than did Group B. The raw scores
show, however, that no pupil in Group B
exceeded the highest score made in Group A on
Part I, while seven pupils of Group A
exceeded the highest score made in Group B on
Part II. Three pupils of Group A made higher
scores than the highest score in Group B on
the total test.


TABLE III


AVERAGE SCORES AND SIGNIFICANCE OF DIFFERENCES
ON COLUMBIA RESEARCH BUREAU ALGEBRA TEST 2B,
FOR GROUPS A AND B


                              Total

Range:     Group A            [illegible]–29
           Group B            6–24

Median:    Group A            16.0
           Group B            15.0

Norm (Median)                 16

Mean:(a)   Group A            16.0 ± 6.3
           Group B            14.2 ± 4.0

Mean Difference(b)            1.8 ± 1.2

Significance:
           Critical Ratio     1.50
           t-ratio            P > .05
           Covariance         P > .05

(a) Standard deviations are given with means.
(b) Standard errors are given with differences.

[The Part I and Part II columns were lost in
reproduction; the only surviving entry from
them is a range of 0–17.]


In testing the significance of these differences,
the critical ratio and the t-value according
to Guilford¹⁵ were computed. While these
statistical measures are somewhat misused
(the two groups were not matched), the obtained
differences were substantiated by the
use of Lindquist’s procedure in the analysis
of a simple methods experiment.¹⁶ All three
measures indicate that the mean difference on
Part I of the test was significant at the one
per cent level in favor of Group B; the mean
difference on Part II was significant at the
one per cent level in favor of Group A; the
mean difference on the total test was not
significant, since such difference could occur at
least once in twenty samples.
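
The critical ratio for the total test can be checked by hand from the figures in Table III. The following is a minimal sketch in Python, assuming that the critical ratio is the mean difference divided by its standard error, and that the standard error of a difference between two independent means is the root of the summed squared standard errors. The function names and the 1.96 threshold (the usual two-sided five per cent point of the normal curve) are illustrative assumptions, not taken from the study, and the class sizes are not stated here, so only the ratio itself is reproduced.

```python
import math


def se_of_difference(se_a: float, se_b: float) -> float:
    """Standard error of the difference between two independent means."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


def critical_ratio(mean_diff: float, se_diff: float) -> float:
    """Mean difference expressed in units of its standard error."""
    return mean_diff / se_diff


def exceeds_five_per_cent(cr: float, threshold: float = 1.96) -> bool:
    """True when the ratio passes the assumed two-sided 5% normal threshold."""
    return abs(cr) >= threshold


# Total test, Table III: mean difference 1.8 with standard error 1.2.
total_cr = critical_ratio(1.8, 1.2)
print(f"critical ratio = {total_cr:.2f}")
print("significant at 5%:", exceeds_five_per_cent(total_cr))
```

A ratio of about 1.5 falls short of the assumed threshold, which agrees with the statement that the difference on the total test was not significant.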

So far as the results of the Columbia Re- 
search Bureau Test indicate, it would appear 
that Group A had developed, through exercise 
in the problem-solving process, a slightly 
better than normal (though variable) ability 
to solve verbal problems, at somewhat of a 
sacrifice of ability to handle the mechanics as 
covered by the test. 

15 J. P. Guilford, Fundamental Statistics in Psychology and
Education, pp. 135-137. New York: McGraw-Hill Book Co.

16 E. F. Lindquist, Statistical Analysis in Educational
Research, pp. 100-104. Boston: Houghton Mifflin Co., 1940.

2. Objective measurement of abilities more
closely associated with the problem-solving 
process was highly desirable. “Abilities” is 
used here advisedly, for the process involves 
not one ability but a composite of abilities 
that might be separable for purposes of test- 
ing. Among them would be certainly the abil- 
ity to recognize a problem situation, the 
ability to analyze evidence, accepting that 
which leads to acceptable solution and reject- 
ing that which does not, the ability to select 
or plan a method of solution, and the ability 
to perform the solution. 

Standard objective tests of this nature do 
not exist. It was necessary, therefore, to con- 
struct one that would measure in part the 
abilities mentioned. Five problem situations 
of non-mathematical nature were selected and 
developed as a measure of two aspects of the 
problem-solving process; namely, recognition 
of a problem situation, and interpretation of 
evidence. This test was given Group A and 
Group B on the second and third days after 
the opening of school. It was revised for use 
as a final test in June, in order to consume 
only a single period of class time. This was 
done by changing the material of two situa- 
tions and revising several sections of the 
others. 

The reliability and validity of the pretest 
and final test were, of course, unknown. In 
order to arrive at a measure of reliability, a 
product-moment correlation coefficient was 
calculated between the odd-numbered and 
even-numbered items of each test for the 
combined groups. A coefficient of .56 ± .04
(PE) was indicated for the pretest, and of
.48 ± .06 for the final test. These coefficients
leave something to be desired as to the
constancy with which the two instruments
measure what they purport to measure, only
partly compensated for when the Spearman-Brown
formula for predicting a test of
doubled length was applied. Under this
prediction the pretest (of sixty-seven items)
bore a reliability of .71, and the final test (of
sixty-five items) a coefficient of .65.¹⁷
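
The Spearman-Brown step described above can be sketched in a few lines of Python. This is only an illustration under the standard form of the formula, r' = k r / (1 + (k - 1) r), with k = 2 for a test of doubled length; the function name is hypothetical. Applied to the rounded half-test coefficients quoted in the text, it gives about .72 and .65. The first differs slightly from the reported .71, presumably because the published half-test coefficient was itself rounded.

```python
def spearman_brown(r_half: float, length_factor: float = 2.0) -> float:
    """Predicted reliability of a test lengthened by `length_factor`."""
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


# Odd-even coefficients quoted in the text (already rounded to two places).
pretest_full = spearman_brown(0.56)
final_full = spearman_brown(0.48)

print(f"pretest: {pretest_full:.2f}")
print(f"final test: {final_full:.2f}")
```

Both predicted values remain below the .80 that Guilford suggests for group predictions.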

17 J. P. Guilford, op. cit., p. 219, suggests a reliability
coefficient of above .80 for group predictions.

Establishment of the validity of these
“straight thinking” tests was not attempted
by computational methods, for the reason that
there is no singleness of meaning of the
expression “straight thinking”; it appeared
therefore useless to attempt to establish
criteria of validity in terms of abilities as the
factor-analysis approach would necessitate.
For this reason it was decided to secure
authoritative judgment by way of validation.
Copies of the final test were sent to twelve
individuals working in the field of mathematics
whose judgment could be reliably
considered as expert. With the tests was sent a
list of five statements or questions upon which
comment was invited. Returns were made by
ten persons, among whom were six professors
of mathematics and education in universities
and teachers colleges, three supervisors of
mathematics in public schools, and one junior
high school teacher. Analysis of their
responses did not indicate full agreement on all
points raised, but responses expressing full
or qualified agreement were in all cases
decidedly in the majority.

The medians and means of scores on the 
pretest for the two groups (mean difference 
3.2, Table IV) indicated a difference in favor 
of Group B. Tests of significance showed the 
difference to be not significant, with a prob- 
ability above the five per cent level. The very 
slight difference in favor of Group A in the 
final test (mean difference 0.9) was certainly 
not significant. Group A, however, showed 
greater increase in ability to handle materials 
of the type with which the tests dealt. The 
mean difference of 4.5 between pretest and 
final test scores for Group A was just signifi- 
cant at the one per cent level, while the mean 
difference of 0.4 between pretest and final test 
scores for Group B was not significant.
Lindquist’s analysis of covariance in a simple
methods experiment¹⁸ was applied to the
pretest and final test scores of both groups,
which yielded a less decided significance,
probability falling between the one per cent
and five per cent levels.
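
The mean differences quoted in this paragraph follow directly from the group means in Table IV. As a small check, the sketch below recomputes them in Python; the dictionary layout and variable names are illustrative only, and the significance tests themselves are not reproduced because the class sizes are not stated here.

```python
# Group means on the "straight thinking" tests, as given in Table IV.
pretest_mean = {"A": 39.5, "B": 42.7}
final_mean = {"A": 44.0, "B": 43.1}

# Gain of each group from pretest to final test.
gain = {g: round(final_mean[g] - pretest_mean[g], 1) for g in pretest_mean}

# Difference between groups (A minus B) on each occasion.
pretest_gap = round(pretest_mean["A"] - pretest_mean["B"], 1)
final_gap = round(final_mean["A"] - final_mean["B"], 1)

print("gain:", gain)
print("A minus B, pretest:", pretest_gap)
print("A minus B, final:", final_gap)
```

The recomputed gains of 4.5 and 0.4, and the between-group differences of 3.2 in favor of Group B and 0.9 in favor of Group A, agree with the figures cited in the text.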


It is possible that, had more nearly perfect
instruments been used, the obtained differences
might have shown greater significance.
At any rate, these results bear some indication 
that ninth-grade pupils of average general 
ability can realize a significant gain in ability 
to analyze materials relating to recognition of 
problem situations and in analysis and inter- 
pretation of evidence associated with such 
situations, if instruction is so directed. 

It was considered worthwhile to know
something of the association between scores
on the final test and scores on Part II
(Problems) of the algebra test. The correlation
in the case of Group A was .64 ± .06,
and for Group B, .33 ± .10. This rather
startling difference probably indicates that
the factors operating for greater success in
handling materials of the type illustrated in
the final test and in the solution of verbal
algebra problems are similar. One of these
factors might well be reading ability.
Certainly the pupils of Group A had more
experience in meaningful reading than had
Group B. A second factor might be the
development of the habit of critical analysis and
evaluation called for in the problem-solving
process.

18 E. F. Lindquist, op. cit., pp. 191-195.


TABLE IV


AVERAGE SCORES AND SIGNIFICANCE OF DIFFERENCES
ON PRETEST AND FINAL TEST OF
“STRAIGHT THINKING” FOR GROUPS
A AND B


                        Pretest         Final

Range:    Group A       25–54           31–58
          Group B       29–67           22–56

Median:   Group A       40.7            44.7
          Group B       41.5            44.5

Mean:     Group A       39.5 ± 6.4      44.0 ± 7.3
          Group B       42.7 ± 9.0      43.1 ± 8.1

Mean difference         −3.2 ± 1.8      0.9 ± 1.8

[The significance entries and the Pretest-Final
section of the table (Mean Difference and
Significance for Group A and for Group B, by
Critical Ratio, t-ratio, and Covariance) were
largely lost in reproduction; the surviving
fragments read “.06”, “P>.05”, “P>.05”.]


3. Comments by pupils on the values of 
the course as they saw them were invited. 
On the day of the last class meeting, pupils 
of Group A were asked to write brief comments
on “What have you found in this year’s
mathematics different from the mathematics 
of other years? What benefit do you believe 
it has been to you?” The papers were re- 
turned unsigned, in order that freedom of 
expression might be exercised. 


Every pupil in the class made comment. 
The papers were examined for specific evi- 
dence of values that pupils felt they had 
gained. Nearly half the papers contained
expressions only to the effect that the pupils
“liked math better” or “understood it better”
than in other years. About a third gave more
particular reasons for their greater liking or
better understanding. Mentioned in these
papers were the three problem types;
formulas, graphs, and equations; looking up
references; independent work on the special
problems; generalizing; and application to
everyday life.

Surprisingly, only seven pupils expressed 
criticism of any kind, and only one of these 
stated bluntly, “I didn’t get anything out of 
math. I don’t like it.” Others qualified their 
dislike by such statements as “I never liked 
math”, and “I never passed math”. Yet even 
these individuals conceded that “some of the 
problems were interesting”. Apparently more 
than a single year of a differently oriented 
mathematics is needed to affect deep-rooted 
negativism. 

4. The preceding paragraphs have indi- 
cated to some extent the changed interests of 
pupils in mathematics as a result of their 
ninth-grade course of study. A more definite 
idea of this change is given in Table V. When 
the pretest was given early in September, the 
pupils of both groups were asked about their 
present interests in mathematics—whether it 
was the subject they liked best or least of 
their studies, or whether their attitude was 
just “so-so”. At the time of the final test in 
June the same questions were asked. The 
positive change in interest in Group A, and 
the negative change in Group B, are signifi- 
cant perhaps of nothing more than the vola- 
tility of pupil interest in ninth-grade age 
groups. Yet it is on the basis of such interests 
that pupils make important decisions affect- 
ing their future study plans. 

Pupils of both groups were also asked in 
September whether they would select mathe- 
matics as a study in senior high school, if it 
were not a required subject. At that time they 
were not generally aware that mathematics 
was a requirement in few of the senior high 
school curriculums, elective in most. Fourteen 
pupils of Group A and eighteen pupils of 
Group B indicated a mathematics preference. 
In early spring they became acquainted with 
these curriculums under the guidance of their 
home room teachers (neither of whom was
their mathematics instructor). Of the
fourteen Group A pupils, twelve selected a
curriculum requiring mathematics, or elected it
in a curriculum not requiring it; of the
eighteen Group B pupils, eight selected
mathematics on a required or elective basis.
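
The follow-through from September preference to spring selection can be put as simple proportions. A minimal sketch in Python, using only the counts quoted above; the variable names are illustrative.

```python
# Pupils who said in September they would choose mathematics, and how many
# of those actually selected it in the spring (counts quoted in the text).
preferred = {"A": 14, "B": 18}
selected = {"A": 12, "B": 8}

follow_through = {g: selected[g] / preferred[g] for g in preferred}

for group, share in follow_through.items():
    print(f"Group {group}: {share:.0%} of September choosers selected mathematics")
```

On these counts about six in seven of the Group A pupils held to their stated preference, against fewer than half of the Group B pupils.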


TABLE V


EXPRESSED INTEREST IN MATHEMATICS OF
PUPILS IN GROUPS A AND B


[Entries for Group A and Group B were lost in
reproduction. Surviving row headings:]

When asked in September:
Of your studies, do you like mathematics best?
Neither like nor dislike

When asked in June:
Do you like mathematics

When asked in September:
Would you take math. in senior high school
if it were not a required course?

Number actually selecting mathematics for
tenth grade

When asked in June:
Did you like ninth grade math. [...] than
seventh-eighth


5. Group A was selected originally for par- 
ticipation in the new course of study, because 
it was believed that few, if any, would con- 
tinue mathematics beyond ninth grade. The 
boys were in the commercial curriculum; if 
they continued in one of the commercial cur- 
riculums in tenth grade, no mathematics 
would be required. The curriculum of the girls 
was the general program; probably none 
would select a curriculum requiring mathe- 
matics or would elect it. It was, therefore, 
somewhat of a surprise to find that, in the 
spring, the majority of the boys changed 
from their commercial interests, and eleven 
of them placed mathematics in their tenth- 
grade program. One girl selected mathematics. 
This presented an opportunity for testing the 
new course beyond its primary purpose: Was 
it a course with an “open end”, permitting 
pupils to go on into secondary mathematics 
without penalty, if their interests so dictated? 

Tenth-grade mathematics offerings were
three: second year algebra, industrial
mathematics (a course essentially algebra but
including arithmetic fundamentals), and
consumer mathematics (a general social
mathematics). Ten pupils of Group A and two of
Group B selected the algebra. One of Group
A and six of Group B selected industrial
mathematics. One of Group A and none of
Group B selected consumer mathematics. Six
of Group A selected mathematics on a free
elective basis, and two of Group B did so.

In order to measure, in a general way, the
ninth-grade course of these pupils, a brief
case study record was compiled for each
pupil, including such items as intelligence
quotient, Standard Algebra Test score, and
final “straight thinking” test score. An esti- 
mate of the probable success of each pupil in 
tenth-grade mathematics was made, and this 
was compared with a report from the tenth- 
grade instructor at the end of the year. 

It was found at that time (no previous contact
had been made during the year) that
two of the twelve pupils had never entered 
the high school, and that three more had left 
during the first nine weeks of school. Of the 
seven who completed the year’s work, two did 
average work, three did below average but 
passing work, and two (algebra students) 
failed the course. These reports were entirely 
in agreement with the estimates of probable 
success previously made. Those who did a 
reasonably good quality of work in the ninth- 
grade course generally did passing or better 
work in the tenth grade. Those who failed the 
tenth-grade course most certainly would have 
been failures in a ninth-grade algebra course. 

Shown here is the need for making of 
ninth-grade mathematics a course possessing 
its own values aside from its propaedeutic 
worth, without, however, neglecting the latter. 
For the pupils who left school in tenth grade, 
ninth-grade mathematics was truly terminal, 
as it was for the two-thirds of Group A who 
took no further mathematics. Certainly their 
ninth-grade course more nearly met their 
needs than would have a course of study 
whose main purpose is to prepare for the suc- 
ceeding year’s work. 

This touches the very heart of the problem 
of ninth-grade mathematics, for this course 
stands guard at the crossroad; behind it lies 
the general mathematics knowledge of the 
lower years of which all partake; before it 
lies the specialized secondary mathematics 
which has been until now, and still is, the 
province of the specialist and the superior
pupil. Ninth-grade mathematics should not 
be the weeding-out ground for succeeding 
years. It should be an area of experience that 
instills new interest in pupils’ minds, to stimu- 
late desire for further experience in new areas. 
That such interest and stimulation are pos- 
sible has been demonstrated here. By such 
means may ninth-grade mathematics do its 
fair part in meeting the criticism that, as the 
need for mathematical knowledge and think- 
ing grows in the modern world, student in- 
terest in mathematics decreases. It then be- 
comes the responsibility of the secondary 
school to maintain that interest, and to pro- 
vide mathematical training and experience 
commensurate with the abilities of the pupils. 


CONCLUSIONS 


It is recognized that there are obvious 
limitations regarding the conclusions that 
may possibly be drawn from the results pre- 
sented here. Appraisal of the proposed objec- 
tives was in terms of a single group of pupils 
in a single school. The pupils concerned were 
pupils of average general ability in this 
school; in another school the average pupil 
may be a quite different individual. Standards 
of achievement and quality of teaching vary 
among teachers and among schools. In view 
of these and other considerations a large de- 
gree of circumspection must be exercised in 
attempting to make generalizations carrying 
universal applicability. 

Certain general conclusions may probably 
be reached safely on the basis of the evidence. 

1. In the case of pupils of average general
ability we reap what we sow in mathematics 
instruction. If the learning of skills is em- 
phasized, pupils show average ability in pure 
skill, but less than average power to apply 
the skills. Conversely, if instruction is directed 
toward growth in power, some sacrifice is 
made in skill. It probably follows, therefore, 
that for the average ninth-grade pupil there 
should be sought such a kind and quantity 
of skills for his instruction that he may de- 
velop optimum skill and optimum power to 
apply his skill. 

2. It is unfortunate that the pattern of 
mathematics instruction has been consistently 
along the lines of skill learning and skill 
application. The average pupil apparently 
emerges from such a course of instruction 
with little else than a mass of acquired skills. 
On the other hand, if learning is directed to- 
ward understanding of a situation beyond its 
pure skill aspects, there is growth in power 
to use the skills acquired, and the pupil ex- 
periences a satisfaction in meaningful learn- 
ing not present under the traditional pattern 
of instruction. 

3. No direct evidence has been presented 
to show that the step-by-step technique of 
problem solving had become a part of the 
pupil’s thinking pattern, that experience with 
the problem-solving process had resulted in 
desirable changed behavior, or that the pupil 
had learned habitually to think of his prob- 
lems as falling into categories of problem 
types. A reason for this omission is the diffi- 
culty involved in the securing of such evi- 
dence; outcomes associated with thinking- 
and behavior-patterns do not lend themselves 
to paper-and-pencil testing. At the same time 
there is no reason to believe that the increased 
interest of pupils has not led to desirable 
change. Certainly the way to realization of 
such values is through stimulation of pupil 
interest. That appreciable stimulation is pos- 
sible with the average ninth-grade pupil has 
been amply demonstrated. Lack of adequate 
means of measurement does not preclude the 
possibility of achieving desirable objectives, 
nor should this lack deter efforts toward 
realization of such objectives. 

4. There is evidence that a substantial 
relationship exists between ability to solve 
verbal algebra problems and ability to analyze 
materials of a non-mathematical nature, in 
so far as the test on “straight thinking” 
measured the relationship. It is possible that 
inclusion of non-mathematical materials in a 
first course in algebra may result in increased 
ability to handle verbal problems. 

5. Pupil ability to analyze materials of 
non-mathematical nature can be improved, if 
instruction is directed to that end. 

6. There is evidence that pupils given the 
experience offered by a course of study that 
emphasizes values besides skill can realize a 
measure of success in later sequentially 
ordered mathematics in keeping with their 
general learning ability. 

7. There is a possibility that values can be 
so emphasized in ninth-grade mathematics 
instruction as to create greater interest in 
and desire to continue mathematics study. 





PUPIL INFERENCE: VARIETY, DEPTH, 
AND DIRECTION OF ERROR 


CONWELL DEAN HIGGINS 
Albany High School, New York 


BACKGROUND 


Three measures of inductive ability were 
used in an experiment¹ to determine the effect 
of training pupils in inductive methods. The 
tests used were concerned with these aspects 
of inductive reasoning: (1) making original 
inferences from biologic data, (2) classifying 
inferences made by others in categories, and 
(3) identifying patterns presented in non- 
verbal material. The present report is re- 
stricted to a general consideration of original 
inferences made by pupils. 


The inductive ability of children has been 
studied by two methods: (1) the interview 
or “clinical” method, and (2) the group 
method. 

The interview method is one in which the 
experimenter seeks to learn how the child 
reasons by having the child verbally explain 
phenomena or experiments. The investigator 
is not confined to one question, but may de- 
rive a variety of responses by a series of 
questions. Piaget used the interview method 
extensively² with children whose ages ranged 
from three years through eleven years. 


The group method is one in which the in- 
vestigator presents his subjects with an 
experience (demonstration, experiment, or 
experimental data) and the individuals re- 
spond by writing an explanation or conclu- 
sion. One advantage of the group method 
over the interview method is that the sample 
may be larger. The group method, however, 
can be used only with children who are able 
to write. 


A criticism of the group method of eliciting 
concepts of causal relations is that the investigator 
“accepts the first ready answer of the 
child as representing his best causal thinking 
and his actual concept.”³ A counter criticism 
of the interview method is that possibly “the 
child is pressed to explain what he means by 
each statement . . . and that because they 
feel obliged to answer the questions, they 
invent an answer which will help them out of 
their difficulty.”⁴ 

¹ Conwell Dean Higgins, “Educability of Adolescents in Inductive Ability”. Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the School of Education of New York University, 1942. 

² Jean Piaget, The Child’s Conception of the World. New York: Harcourt, Brace and Co., 1929. Jean Piaget, Judgment and Reasoning in the Child. New York: Harcourt, Brace and Co., 1928. 

Piaget categorized children’s explanations 
of causality into seventeen types⁵ and indicated 
that the child’s growth in ability to 
infer causality was definitely related to 
maturation.⁶ 

The conclusions of Piaget were not supported 
by the investigation⁷ of Huang, who 
presented demonstrations in the form of conjuror’s 
tricks, simple scientific experiments, 
and illusions to twenty-seven primary children. 
The children’s explanations of the 
occurrences were obtained by the interview 
method and “were, with rare and mostly 
equivocal exceptions, definitely naturalistic. 
They were physical concepts of a ‘simple 
variety’ but not psychological, finalistic, 
magical, moral, animistic, artificialistic or 
mystic.”⁸ Using many of the questions utilized 
by Piaget, Oakes⁹ studied the responses of 
children as well as adults, and concluded that 
the evidence failed to support Piaget’s finding 
of definite stages in conceptual growth. 

PUPIL INFERENCE 

In eliciting original inferences, nine rather 
simple experiments were presented to the 
pupils, who were directed to write a conclu- 
sion based exclusively on the data of each 
experiment. The conclusions were evaluated 
in terms of three categories: (1) complete- 
ness, (2) beyond the data, and (3) falsity. 


³ Jean Marquis Deutsche, The Development of Children’s Concepts of Causal Relations, p. 92. Monograph No. 13, Institute of Child Welfare. Minneapolis: University of Minnesota Press, 1937. 

⁴ Loc. cit. 

⁵ Jean Piaget, The Child’s Conception of Physical Causality, pp. 258-56. New York: Harcourt, Brace and Co., 1930. 

⁶ Ibid., pp. 267-73. 

⁷ I. Huang, Children’s Explanations of Strange Phenomena. Smith College Studies in Psychology, I. Northampton, Mass.: Smith College, 1930. 

⁸ Ibid., p. 115. 

⁹ Mervin E. Oakes, “How Do Children Explain Things?” Science Education, 26 (February, 1942), 61-6 






September, 1944] 


Some insight into the causal thinking of 
tenth-year biology pupils may be possible through 
the study of pupil responses. The pupil re- 
sponses presented in this report represent a 
very small sample of the 4,320 responses 
obtained from the experiment. In the selection 
of the responses, no attempt was made to re- 
flect the statistical comparisons made between 
the initial and final conclusions of either the 
control or the experimental samples. Hence, 
random sampling was not made. The re- 
sponses represent a subjective selection on the 
part of the investigator. They were chosen to 
show qualitative differences in inductive 
reasoning. 

In presenting the responses,¹⁰ the initial 
conclusions are designated by the symbol I, 
and the final conclusions by the symbol F. 
Thirty school weeks intervened between the 
initial and final conclusions. The comments 
relating to the conclusions are not intended 
to be exhaustive, but rather to point out 
salient features of the responses. 


ITEM 1 


In a certain experiment egg plants were 
grown in soil containing varying degrees of 
moisture. One hundred plants were grown in 
soil which was 45% saturated with moisture. 
The temperature of the soil was kept at 28 
degrees C. At the end of 25 days, the average 
dry weight of the plants was 5.6 grams. 
Another 100 egg plants were grown in similar 
soil which was saturated with 65% moisture. 
The temperature was kept at the 28 degrees 
C. At the end of 25 days the average dry 
weight of the plants was 12.1 grams. A third 
group of 100 plants was grown in similar soil 
which was 85% saturated with moisture. The 
temperature of the soil was kept at 28 degrees C. 
At the end of 25 days the average 
dry weight of the plants was 14.2 grams. A 
fourth group of 100 plants was grown in 
similar soil which was 95% saturated with 
moisture. The temperature of the soil was 
kept at 28 degrees C. At the end of 25 days, 
the average dry weight of the plants was 19.1 
grams. 
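
As an illustrative aside that is not part of the original test, the data of Item 1 can be tabulated and checked with a short sketch. The names `ITEM_1`, `weights_rise_with_moisture`, and `extremes` are hypothetical and introduced only for this example. The sketch shows what the data alone support: average dry weight rises at each tested step in soil moisture, for egg plants, between 45% and 95% saturation, at 28 degrees C.

```python
# Figures copied from Item 1: (soil moisture saturation in %, average dry weight in grams).
# All names in this sketch are hypothetical; they are not part of the original study.
ITEM_1 = [(45, 5.6), (65, 12.1), (85, 14.2), (95, 19.1)]


def weights_rise_with_moisture(groups):
    """Return True if average dry weight increases at every step up in moisture."""
    ordered = sorted(groups)
    return all(w1 < w2 for (_, w1), (_, w2) in zip(ordered, ordered[1:]))


def extremes(groups):
    """Return (difference, ratio) of dry weight between the wettest and driest groups."""
    ordered = sorted(groups)
    low, high = ordered[0][1], ordered[-1][1]
    return round(high - low, 1), round(high / low, 2)


if __name__ == "__main__":
    print(weights_rise_with_moisture(ITEM_1))
    print(extremes(ITEM_1))
```

The difference and ratio between the extreme groups are the quantities that some pupils quote in their written conclusions, so the same helper can be used to check a pupil's arithmetic against the data.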


Selected Responses on Item One 

Pupil A 

(I) I conclude that the less saturated the soil is the less weight in moisture. 

(F) Conclusion: The higher the temperature of the soil the better the soil will grow. 

¹⁰ The responses have not been edited. 

The conclusions do not present evidence 
that the verbal material was understood. It 
may be that, if the pupil had done the experi- 
ment, he would have been able to arrive at a 
satisfactory conclusion. This illustrates the 
fact that the Direct Conclusion Test is not 
a pure inductive measure, but probably 
measures reading ability, as well as other 
abilities. 

The conclusions of this pupil were scored 
as 0-0-0, indicating that false statements and 
rash conclusions imply some understanding of 
the experiment. 


Pupil B 

(I) Conclusion: All egg plants which were grown in soil containing varying degrees of moisture had all different grams or ended up with different number of grams. 

(F) Conclusion: egg plant were grown in soil containing varying degrees of moisture. At the end of 25 days each plant the average weight of each plants were all different. 


The pupil’s response suggests that there 
was an awareness of concomitant variation of 
plant weight and per cent of soil moisture. 
Although the degree of insight was small, it 
indicates the pupil’s ability to do more than 
repeat the data. 


Pupil C 

(I) While I was reading the experiment the temperature of the eggs in different sections were getting hotter and the grams were getting higher. 

(F) I conclude that the egg plants grew better in the 95% saturated with moisture at 28 degrees C. 


Initially, the pupil erred in stating that 
eggs grew hotter. Finally, the pupil correctly 
referred to egg plants, but failed to restrict 
the comparison to dry weight, hence the conclusion 
was scored as being beyond the data. 


Pupil D 

(I) We conclude that the more moisture a Plant has the heavier and healthier it will be. 

(F) I conclude that the Plants saturated with the most water (95%) at the end of 25 days weighed 19.1 grams almost 4 times as much as the plant grow in 45% water. The more water in the soil the heavier the Plants get. 


The initial conclusions went beyond the 
data in three respects: failure to recognize 
that the highest amount of soil moisture 
tested was 95%, failure to restrict the inference 
to egg plants, and inclusion of plant 
health as a basis of comparison. 

The final conclusion restricts the compari- 
son to weight. Otherwise the conclusion is 
similar to the first inference, in spite of the 
fact that the pupil correctly compares the 
weight of plants grown at 45 and 95 per cent 
moisture saturation. 


Pupil E 

(I) I say the more heat and moisture a plant gets the more it grows. If 100 plants was 45% saturated with moisture at 28 degrees C. and at the end of 25 days weight 5.6 grams. And 100 plants saturated 95% moisture at 28 degrees C. at the end of 25 days weighed 19.1. There is a difference in weight by 13.5 grams. 

(F) I conclude that the more the soil is saturated with moisture the more the egg plant grew. because in the first experiment at 45% saturated with moisture and at the end of 25 days, dry weight was 5.6 grams. the last experiment 95% saturated with moisture and at the end of 25 days weight 19.1 grams. 

The final conclusion was an improvement, 
in that heat was ruled out as an experimental 
factor and the statement was restricted to 
egg plants. 


Pupil F 

(I) I therefore conclude the best soil for growing egg plants is 45% saturated with water, 28 degrees C. temperature of the soil the plants were in a better condition after using this soil. 

(F) I therefore conclude that egg plants need a great deal of moisture and grow best in soil saturated with 95% moisture at 28 degrees C 

The first inference was false, while in the 
second inference “grow best” was substituted 
for dry weight of egg plants. The data do not 
necessarily indicate that a saturation of 95 
per cent moisture is the optimum. 


Pupil G 

(I) This shows that the more saturated the soil is the higher the plants grow. The less saturated the surface, the plant doesn’t grow as fast. 

(F) This experiment shows that the more moisture in the soil kept at 28 degrees C. the more the egg plants weighed at the end of 25 days. 

Height of plant and rate of growth were 
used for comparison instead of dry weight in 
the first conclusion. The final statement 
failed to indicate the upper limit of soil 
moisture. 


Pupil H 

(I) As the moisture was increased the weight of the soil increased. This proves that the more moisture a soil has the more weight plants will have showing plant will take all the water they can get. 

(F) . . . plant heavier regardless of temp. 

Both conclusions went beyond the data. If 
the pupil had hypothesized that “water probably 
has some food and tissue building 
factor”, the statement would be judged on 
the basis of completeness rather than beyond 
the data. 


Pupil I 

(I) Conclusion that the egg plants increase in weight according to amt. of moisture obtained during growing season. Or results after dried plants are those with greatest amt. of moisture during growing season weigh more when at full growth and dried. 

(F) The more moisture the more the egg plants. 

The second portion of the initial conclusion 
is obscure. In comparing the final and 
initial conclusion, it is unlikely that the pupil 
would have regressed to this extent. This 
illustrates one drawback of the group method 
in collecting inferences. 


Pupil J 

(I) The more water that was put into the ground the higher the percentage of average dry weight. 

(F) The more water there is in the soil the more weight the egg plants have. The longer the time they are under these conditions the more weight they develope. 

The initial conclusion was too incomplete 
to make an appraisal, while in the final conclusion 
growing time was interpreted as being 
an experimental factor. 


Pupil K 

(I) Egg plants grow better in soil which is moist. The more moisture the plant receives, the healthier it becomes and grows. Moisture is good for plants. 

(F) I conclude that the more the soil was saturated with water, the greater weight plants were produced, since the temperature was always constant. 


The final conclusion was an improvement, 
in that the comparison was restricted to 
weight of plants. Inasmuch as the statement 
was in the past tense, it was assumed by the 
investigator that the pupil referred to egg 
plants. Where a statement of this kind was 
made in the present tense, it was assumed 
that the pupil referred to any plant and was 
judged as being beyond the data. 


ITEM 2 


A farmer has a beanfield which is infested 
with injurious insects and is in danger of 
being destroyed. All parts of the plants are 
dusted with arsenic poison (stomach poison), 
sufficiently strong to kill any insect if the 
poison is eaten. The insects continue to feed 
on the plants and are not killed. 


Selected Responses on Item Two 

Pupil L 

(I) When the injurious insects eat upon the plants that have arsenic Poison (bean) Fields they are killed and the bean field is left to grow without any danger of injurious insects bothering them. 

(F) This shows that the farmer’s bean field was not destroyed of insects for he possible did not put enough or sufficient amount of Arsenic Poison to kill the injurious insects. 

Both the initial and final conclusions are 
contradicted by the data. 


Pupil M 

(I) He is trying to save his crops, so the insects won’t kill them. 

(F) The conclusion is that they should destroy the bean field and plants another thing. 

The pupil evidently failed to gain an understanding 
of the problem in his first attempt. 
The data of the experiment are insufficient to 
justify the final conclusion. 


Pupil N 

(I) My conclusion is that the insects are either eating the stems near the ground or that they are injuring the roots of the plants. There could also be a certain amount of poison to be taken in order to kill the insects and they are eating just enough to damage the plants, and being able to keep alive. 

(F) The insects may be able to feed on the plants and still live because they may have a way of going into the ground and feeding on the roots of the plant, thereby not getting any of the poison. Another way is that the insect might have some form of needle or drill shape protrusion on its mouth which digs into the inside of the stem of the plant and sucks out the contents within thereby not getting any of the poison. 

The data do not justify the statement that 
insects have a tolerance for small amounts of 
poison. 


Pupil O 

(I) I concluded that some insects will not die if they are poisoned. 

(F) I concluded that the insects antennae told them something was wrong with the plant, so the insects ate the parts of the plant that were not dusted. 

The initial conclusion was contradicted by 
the data. In the final conclusion, reference to 
the antennae as sense organs was held to go 
beyond the data. Had the statement been 
qualified, the conclusion would have been 
evaluated under the category of completeness. 
The use of the verb “told” should be noted. 


Pupil P 

(I) My conclusion is that the moisture and air around the plant absorbes the poison and therefore does not harm the insect. 

(F) I think that perhaps the soil in the ground and the oxygen in the air absorbes most of the poison. 

The terminal response represents an advance 
in that the conclusion was qualified by 
“perhaps”. 


Pupil Q 

(I) Conclusion: I conclude that if arsenic poison is dusted on bean plants they may be saved from distruction. 

(F) Arsenic poison is not injurious to insects that feed on beans. 

The improvement of induction in this instance 
was that in the final statement data 
are restated and the initial false assertion was 
omitted. 


Pupil R 

(I) The poisoens should be tried at various intersitus and if the insects still continue to eat a different poison should be applied. The insects probably are not harmed by the this poision. 

(F) I concled from facts given that either the insects are not eating the poison or they are not harmed by it. I would suggest the use of a poison or dust that would clog the speracles of these insects. 

The final conclusion probably reveals a 
knowledge of insect anatomy not held at the 
beginning of the experiment. In both instances 
the pupil recognized the ineffectiveness 
of arsenic and suggested the use of 
another poison. 


Pupil S 

(I) I conclude that the insects are not feeding on the bean plants or else the arsenic poison would have killed them. 

(F) I conclude that the insects either are immune to arsenic which is hardly possibly or they do not eat the out side but the inside of the plant where there is no arsenic. 

The first inference was false. 


Pupil R 

(I) I therefore conclude that the insects will not eat the arsenic poison when they still have beans to eat. The arsenic poison will kill them if they ate it but they prefer beans instead. 

(F) Although the plants are dusted with arsenic poison, the insects are not killed when they eat the plants. 

In both cases, zero values were assigned to 
the conclusions. 


Pupil T 

(I) This must show that the insect could not digest his food with a stomach but some other way. 

(F) Conclusion the poison harms the stomach, these insects must have some other way of eating and digesting the plant, without a stomach. 

The inference that the insects of the experiment 
have no stomachs may be due to the 
statement that arsenic is a stomach poison. 
This type of inference occurred several times. 


Pupil U 

(I) Probably the stomach poison won’t take affect unless the farmer eats the beans with the arsenic poison on it. When he does the insects will die. 

(F) This is true because the poison is probably not powerful enough. 

The first statement is difficult to characterize. 
Other responses of this pupil suggest 
that it may not be an attempt to be humorous. 
The final conclusion was judged to be false. 


Pupil V 

(I) The conclusion is that either the farmer has applied the arsenic too late and there are colonies already in the plant or he has not dusted the right plants. Then, too the poison may be defective. 

(F) I conclude that some outside influence is entering this problem, causing the moths to be immune to the poison: or else the poison is being applied wrong to the plants. 

Three inferences of the initial conclusion 
vary in value. It seems that the first hypothesis 
may imply the insects feed inside the 
plants. 

The final conclusion identified the insects 
as moths. 


Pupil W 

(I) Since the insects continue to feed on the plants and are not killed it is concluded that these injurious insects are allergic to the arsenic poison. 

(F) I therefore conclude that these injurious insects were not affected by the arsenic poison and therefore did not die. 

The use of the term, “allergic”, indicates 
that the word was not understood. The final 
statement was held to be repetition of the 
data. 


Pupil X 

(I) I therefore conclude that dusting the plants with arsenic poisoning is not good, since evidence shows that insects must have eaten the plants after powder had blown off. 

(F) I conclude that the insects are not killed by this poison because they do not eat the plant right away after it has been sprayed. When the insect does eat the plant the poison spray has been washed away. 

The essential difference in the conclusions 
is in the way the arsenic is supposed to have 
been removed, by the wind and by the rain. 


Pupil Y 

(I) This shows the plants were not properly dusted with arsenic or the insects are not eating the parts of the plants which were dusted with arsenic. 

(F) The insects must eat some part of the plant which can not be dusted. The farmer should pick the insects by hand. 

The final conclusion was judged as being 
complete; the pupil’s suggestion of insect 
control was not judged for effectiveness. 


ITEM 3 


Nine hundred seeds of a certain plant were 
divided into nine groups of 100 seeds each. 
Each group of 100 seeds was placed in a ger- 
minator. The seeds in all the germinators 
were under the same conditions of air and 
moisture and they were all kept in the dark. 
Each germinator, however, was kept at a 
different temperature. The various tempera- 
tures and the number of seeds which germi- 
nated within 30 days are shown in the follow- 
ing table: 


Temperature in degrees Centigrade   . . .   13   18   25   30   35   39 
Number of seeds which germinated    . . .    0   16   50   84   30    0 


Selected Responses on Item Three 

Pupil a 

(I) It shows the differents in temperature and what it has to do with the amount as number of seed germinated. 

(F) I concluded that at 30 degrees more seed germinated then at 6 degrees, 8 degrees 11 degrees 13 degrees which none of the seeds germinated. 

The final conclusion demonstrates considerable 
growth when compared with the first 
attempt. 


Pupil b 

(I) The temperature from 18 degrees Centigrade to 35 degrees Centigrade the seed are Germinated. The higher the temperature is the more seeds are germinated. 

(F) The temperature, the light, the air, and moisture are controlled. When the temperature reaches 18 degrees centigrade the seeds germinate. When the temperature hits 30 Degrees C. The no. of seeds that germinated is reduced. At 39 degrees C. no seeds germinated. Thus the best temperature for seed germination is between 18 degrees and 30 degrees C. 

The terminal inference erred with respect 
to upper limits of temperature for germination 
and went beyond the data by referring 
to seeds rather than the particular seeds used 
in the experiment. 


Pupil c 

(I) The conclusion to experiment three as I deduct is: that the lower, and the higher the temperature is the less the seeds germinate. But when the temperature is from 18-35 or mediam temperature the more the seeds germinate. 

(F) The conclusion that I draw is that if the seeds are under too low or too high temperature they will not germinate. But if kept inbetwens they will more affectitaley. This is because if the light is to high there is to much energy, and if to low, not enough. 

The final response was incorrect, in that 
light was substituted for temperature. 


Pupil d 

(I) I conclude that this kind of seed thrives better at 30 degrees C. than at any outher, which proves seeds can be cept to warn as well as to cold. 

(F) I conclude that these seeds grow better with the temperature at 30 degrees centigrade but will grow with it between 18 cent. 35 cent. 

Germination was not used by this pupil, 
who substituted “thrives” and “grows better.” 


Pupil e 

(I) I therefore conclude that favorable conditions which help seeds to germinate are 30 degrees C. dark, moisture and air. 

(F) I therefore conclude that these seed germinated best at 30 degrees centigrade under the correct conditions. 

The pupil did not distinguish between the 
experimental and the control factors in the 
beginning, but made the distinction in the 
last response. 


Pupil f 

(I) It shows that more seeds will germinate in thirty in a temperature of 30 degrees centigrade than at any other temperature and moisture and air have little to do with it. 

(F) Control is good. Seeds kept equal except at the temperature. The seeds germinate best between 25 degrees and 30 degrees centigrade and therefore should be kept at that temperature if they are to germinate best. 

The function of the control factors was not 
understood initially. Later, the pupil recognized 
an optimal temperature range for germination. 

The improvement shown in these conclu- 
sions is the restriction of the final response 
to particular seeds. A slight error was noted: 
“temp. of 35 C or above is too warm”. 


SUMMARY 


A consideration of the pupil’s responses 
suggests certain characteristics of incorrect 
inferences. If the response repeats the data 
or is absurd, the pupil making the statement 
probably has not acquired a minimal reading 
comprehension and in these instances the 
pupil has a score of zero on the three cate- 
gories. Other factors may be involved, such 
as the experience of the pupil. Where the re- 
sponse is incomplete, certain portions of the 
data are disregarded or a hypothesis is not 
formulated. In instances where the pupil goes 
beyond the data or makes a false statement, 
the error may involve the experimental or- 
ganism, the basis of comparison, the control 
factors, or the experimental factor. 

The basis of a correct response probably 
depends upon the recognition of these ele- 
ments: (1) the experimental organism, (2) 
the experimental factor or factors, (3) the 
basis of comparison, and (4) the control fac- 
tors (constants). These elements may be considered 
to be sub-concepts that are requisite 
to the formulation of the total concept or 
relationship to be inferred. It is possible that 
a pupil may come to recognize the sub- 
concepts without making the major inference. 
However, the evidence is too scanty to test 
this hypothesis. 



A METHODOLOGICAL STUDY OF THE LEARNING 
OF CHEMICAL CONCEPTS AND OF CERTAIN ABILITIES 
TO THINK CRITICALLY IN FRESHMAN CHEMISTRY¹ 


HERBERT A. THELEN 
University of Chicago 


INTRODUCTION 


The educative process provides students 
with certain opportunities for experiences 
that are expected to result in desirable devel- 
opment on the part of the learner. The under- 
standings that govern the selection or plan- 
ning of these opportunities constitute the 
rationale of the particular teacher. In general, 
the sources of rationales in education are per- 
sonal experience, tradition, and authority, 
with the addition of a large “dash” of expe- 
diency. Such rationales are not necessarily 
“bad”, but they are often in conflict with one 
another, because they are subjective, and 
they may be scientifically unsound, because 
their sources are scientifically unsound. A 
proper theory of education, on the other hand, 
has its sources in objective experimentation, 
and it keeps the facts distinct from its 
assumptions, although it comprises both. 
Sound educational theory is not “true” in any 
absolute sense, but its conceptual components 
are consistent, its salient relationships are 
demonstrable, and its scope is comprehensive 
enough to describe and account for all known 
factors affecting the educative process. The 
development of sound comprehensive theory 
and, through it, the improvement of educa- 
tional practice are the primary aim of research 
in education. 

The link between theory, stated at a gen- 
eral and hence abstract level, and practice, 
which can exist only at the level of operation, 
is method consistently implemented with 
appropriate learning procedures. Within the 
limits of the experiment described in the 
present report, the experimental method and 
procedures were designed to put into practice 
a broad theory of science education that 
attempts to comprehend the nature and con- 
trol of behavior, the nature of learning in 
science, and the personal-philosophical and 


1 “An Appraisal of Two Methods for en, Scientific 
Thinking in General Chemistry”. U: Ph.D. dissertation, 
Department of Education, Chicago, 1944. 
(Under the direction of Professors . Tyler, 
Karl J. Holzinger, and Stephen M. Corey.) 
societal values whose operation is essential to 
personal and societal security. The control 
method and procedures put parts of the theory 
into practice, but their source in uncontrolled 
experience precludes a close connection with 
more than fragments of the theory. 


PURPOSES 


The purposes of this experiment may be 
stated at the levels of practice, method, and 
theory: 

1. At the level of practice the major pur- 
pose is to demonstrate that, by modification 
of the existing procedures of instruction, a 
wider range of abilities in the area of scientific 
thinking can be learned without the “sac- 
rifice” of subject-matter content. 

2. At the level of method the major pur- 
poses are: (1) to devise a method of teaching 
chemistry that is consonant with a stated 
and consistent theory,² and (2) to make con- 
crete suggestions for the improvement of 
freshman chemistry within the present admin- 
istrative framework of instruction. 

3. At the level of theory the major pur- 
poses are: (1) to formulate a general theory 
of science education (outlined in the original 
report), and (2) to validate certain theoret- 
ical principles by demonstrating that (a) 
they may be used to predict the results of 
instruction, and (b) they may be arrived at 
inductively from reflection upon the learning 
experiences and the outcomes of teaching. 


DELIMITATION 


1. The population of the experiment rep- 
resented about 60 percent of the freshman 
students enrolled in afternoon sections of 
general chemistry for engineers, chemistry 
114-124, during the academic year, 1941- 
1942, at Oklahoma Agricultural and Mechan- 
ical College, Stillwater, Oklahoma. 

2. The methods differences were applied 
during the four-hour quiz-laboratory sections 

2 Chapter II of the complete report. 
attended once a week by each student. The 
students were not segregated during two one- 
hour lectures attended each week. 


3. The objectives studied were behaviors 
that both methods strive to realize. The 
evaluated abilities may be stated generally as 
“ability to identify the most appropriate 
meaning of terms” (knowledge of concepts), 
and five abilities composed of specific skills 
believed to be relevant to “scientific 
thinking”.³ 

4. The subject matter studied was the same 
in all classes, although its organization was 
somewhat different under the two methods. 

5. The evaluational evidence was obtained 
from a four-hour objective test administered 
at the beginning, middle, and end of the year; 
31 scores for each student were obtained from 
each administration of the test. Each score 
provided a quantitative description of pre- 
sumably homogeneous kinds of responses to 
items in the six sections of the criterion test.⁴ 

6. Other evidence used to describe the stu- 
dents and to assist in some of the interpre- 
tations came from a questionnaire made by 
the experimenter and from Progressive Edu- 
cation Association Test 8.2a, Interest Index; 
these were administered at the close of the 
experiment. Also available were scores from 
the tests of the freshman week battery: 
A.C.E. Psychological Examination, 1939 (L, 
Q, and total scores); Cooperative English, 
Algebra, and Arithmetic Tests, 1939 (total 
scores); and either the Iowa Chemistry 
Training Test, 1930, CT-Y or the Iowa 
Chemistry Aptitude Test, 1930, CA-X (all 
scores). 


INSTRUCTIONAL METHODS 


1. The control method was the set of pro- 
cedures that have been worked out at the 
College over a period of at least ten years. 
Two control groups were employed: (1) 
“semi-micro”, making use of small-scale lab- 


3 This does not mean that these are the only objectives 
striven for under the two methods. Learning experiences, like 
other experiences, according to our theory, in either planned 
or haphazard fashion will affect the interests, abilities, appre- 
ciations, attitudes, and other behavior determinants of the 
students, depending upon their areas of insecurity. A method 
that promotes critical thinking, for example, at the cost of 
such outcomes as social sensitivity, creativeness, cooperation, 
tolerance, self-direction, and aesthetic appreciation, is decidedly 
a poor method. See also Science in General Education, pp. 
41-45. New York: D. Appleton-Century Co., 1938. 

4 Each homogeneous sort of response is assumed to reflect 
a specific kind or aspect of a learned behavior organization. 
Such organizations or abilities are in turn objectives of the 
course. Thus each ability is described from a pattern of 
several scores. 
oratory equipment (for economy) and a 
laboratory manual that adapted the experiments 
for use with the midget apparatus; and (2) 
“macro”, making use of the conventional-sized 
equipment and a laboratory manual to 
guide the experimentation. 


The control groups spent the first hour of 
the quiz-laboratory meeting in reviewing 
topics discussed in previous lectures and 
summarized on a one- or two-page “quiz 
outline” prepared by the instructor in charge of 
the course. The remaining three hours were 
spent in the laboratory following the direc- 
tions in the manual, recording observations, 
and writing answers to “interpretative” ques- 
tions. In the semi-micro groups, an additional 
feature was the writing of answers to several 
“preparatory” (orienting) questions before 
performing the experiment. 


In general, it is probably fair to say that 
the control method was closely representative 
of at least 80 percent of the instruction now 
being given in freshman general chemistry. 


2. The experimental method made use of 
the same subject-matter content as the con- 
trol method, and the experiments under the 
experimental method involved about the same 
chemical materials and processes. No labora- 
tory manual was used by the students. Each 
experiment was written up completely in 
proper form. 


The experimental groups ordinarily par- 
ticipated in a series of activities planned for 
the following purposes: 


a. Review of lecture material (optional). 

b. Follow-up of preceding week’s experi- 
ment. 

c. Drill or special activity focused on one 
of the major objectives of scientific 
thinking. 

d. Group planning (class or small groups) 
of experimental work. 

e. Performance of experimental work 
(pairs, small groups, individuals). 


The activities designed for these purposes 
were quite varied, including such features as 
silent demonstration, volunteer individual 
demonstration, black-board drill, student 
chairmanship, and a number of special 
mimeographed guide sheets given to the 
students. 
FEATURES OF THE EXPERIMENTAL METHODS 


1. The class should value the learning 
objective for each activity. Such approval is 
sought through: 

a. Periodic class discussions appraising: 

1. Knowledge and abilities attained so 
far in the course. 

2. Re-interpretations, as course out- 
comes, of the importance of various 
sorts of abilities. 

3. Continual reformulations of principles 
in the light of additional 
knowledge. 

4. Consideration of general programs of 
study worth pursuing next. 

b. A few minutes at the beginning of each 
activity (or other strategic time) to 
establish a favorable “set” and to de- 
velop recognition of the need for the 
activity. 

c. Incidental identification of real 
behaviors as planned-for outcomes in 
specific activities. 


2. The class should thoroughly understand 
the procedures of each activity. This under- 
standing is the result of: 


a. Planning the activity in group discus- 
sion; as facility in planning group action 
grows in the group, the requisite detail- 
edness of the planning decreases: 

1. At the beginning of the year, all 
aspects of the plan came from stu- 
dent suggestions, modified by group 
criticism summarized by students and 
written in notebooks and on board. 

2. At the middle of the year (or whenever 
deemed reasonable), problems 
were parceled out to small groups for 
the entire planning and execution; 
the problem was concluded in report 
to and discussion by the class. 

3. At the end of the year (or whenever 
deemed reasonable), planning was by 
individuals with reference to standards 
and responsibilities set by the group. 


3. The class should learn to attack all 
problems in a sound scientific manner. This 
entails: 


a. Stating the problem for experimentation. 
The problem was usually a relationship 
to be formulated. 

b. Listing all factors believed to affect the 
relationship, and suggesting the nature 
of the mechanism for each. 

c. Evaluating each factor through prediction 
of the consequences of its variation 
during the experiment and deciding: 

1. Which factor to vary, and which to 
observe. 

2. Which factors should be held constant. 

3. Which factors may be assumed to 
have negligible effect. 

4. Which factors are irrelevant. 

d. Planning a procedure to enable proper 
variation, observation, and control of 
factors. 

e. Conducting the experiment, collecting 
data. 

f. Organizing data, and inducing 
generalizations. 

g. Criticizing the experiment, pointing out 
possible improvements, and estimating 
magnitude of errors. 

h. Applying learned generalizations in new 
situations. 


4. The class should consciously function as 
a primary group. This entails: 


a. Deciding as a group when group action 
is appropriate to achieve desired specific 
outcomes. 

b. Parceling out responsibility for parts of 
extensive experiments to volunteers, and 
then integrating and organizing the 
individual contributions. Personal values 
consequent upon the experience of 
reporting to the group, and saving of time 
for other valued activities (experiences), 
are motivating factors. 

c. Assisting individuals to formulate their 
ideas, applying criteria of tenability and 
procedure, and appraising resulting 
concepts through application of 
group-determined standards. 

d. Applying group censure or encouragement 
to students whose behavior warrants. 

e. Working under student chairmen in 
activities in which the teacher (as 
expert) believes chairmen will be effective. 

f. Formulating as a group the rules for 
individual conduct in such matters as 
laboratory house-cleaning, where the 
need for such rules arises from the 
experience of the group. Delegation of 
responsibility for the enforcement of 
such rules to members of the group. 


5. The class should follow up each experi- 
ment by analysis (in group discussion subse- 
quent to individual written reports) of: 


a. Sources of error, due to poor technique, 
design of apparatus, or order of proce- 
dure. Magnitude of the error is esti- 
mated on common-sense grounds (of 
previous experience in and out of class). 

b. Formulation of reasonable generalizations 
from experimental data. 

c. Suggestion of subsequent appropriate 
activities. 

d. Identification of cultural areas in which 
discovered generalization seems operative, 
and of values implied in its operation 
and control. 

e. Continual re-examination of previous 
knowledge, resulting in its reorganization 
and more precise definition. 


Although the comparison of the outcomes 
under the methods was limited to objectives 
in the area of scientific thinking, the 
experimental method was designed to contribute to 
the realization of many other partially 
identifiable objectives. The class was regarded as 
a primary group; emphasis was placed upon 
acquisition of responsibility for collection of 
particular data, and for leading discussion by 
individuals in class. In discussions the “slant- 
ing” tended to be toward societal purposes, 
and philosophical orientation. The possibil- 
ities in these areas, it is believed, were scarcely 
explored. The problem of providing “rich” 
activities and of guiding participation of stu- 
dents into channels most worthwhile to them 
emerged as the central problem involved in 
the use of the method. 

The experimental method as here employed 
was consistent with the practices developed 
under the leadership of L. F. Foster at Uni- 
versity High School, Oakland. 


DESIGN OF THE EXPERIMENT 


1. Formation of the samples—An effort 
was made to assign the students to particular 
quiz-laboratory sections in such a way that 
there would be approximately congruent dis- 
tributions of the traits measured during fresh- 
man week in each section; presumably, by 
this means the largest samples could be 
obtained. The procedure failed, because the 
assigned times could not be utilized by many 
of the students, since many did not complete 
their enrolments, and because a large number 
of students enrolled in the course at the 
conclusion of freshman week. The samples as 
finally composed represent a population of 
students whose initial characteristics were 
between those of students who take one 
semester and those who take a year of 
chemistry. There was only about a 60 percent 
survival, and the “year” students were supe- 
rior to the semester students. To facilitate 
comparisons of growth during the two semes- 
ters, some of the poorer students were elimi- 
nated from the semester samples, and some 
superior students were eliminated from the 
year samples. The distributions of initial 
scores for general accuracy on the six sec- 
tions of the criterion test were very similar 
among the three semester and the three year 
samples, so that gains during the second 
semester could be inferred from measured 
gains during the first semester and during 
the year. 

The samples were stratified with respect to 
per cent of students who had taken high 
school chemistry, per cent of students taught 
by each instructor, and roughly by per cent 
of girls. 

2. Administration of instruction—The in- 
struction in the control groups was guided by 
the professor in charge of the course through 
the medium of the weekly typed quiz-outlines, 
and through bi-weekly staff meetings during 
the first semester. The laboratory instruction 
was confined to hints, warnings, and mechan- 
ics; the quiz-discussion was guided by the 
instructor as he wished, and ranged from 
repetition of the regular lectures to occasion- 
ally irrelevant discussions. This conformed 
closely to the usual procedure over the pre- 
ceding years. 

The instruction in the experimental groups 
was guided by typed, detailed lesson plans 
devised by the experimenter. There was sel- 
dom more than ten minutes of time during 
the four hours that was not planned for, and 
in spite of the greater complexity of the 
activities and the wide differences in under- 
standing by the instructors of the range of 
objectives, it is probable that there was as 
much uniformity of overt activity in the ex- 
perimental groups as in the control. Salient 
features of the lesson plans were discussed, 
usually for about twenty minutes, with each 
instructor. The experimenter also asked for 
and received help from the instructors in 
planning follow-up activities for the experi- 
ments, and in deciding when a “change of 
pace” in the types of activities seemed psy- 
chologically appropriate. 


3. Teachers—The use of eight different 
teachers to give instruction during the year 
may have reduced differences in gains among 
the methods. The teachers were chiefly young 
instructors who were sympathetic to the ex- 
periment, but whose previous training and 
thinking had been almost exclusively in 
chemistry or biological science. It seems rea- 
sonable to conclude that any systematic dif- 
ferences between outcomes of the methods 
were due to the methods. In this case, there- 
fore, it was demonstrated that the experi- 
mental method was “practical”, since inex- 
perienced teachers with little supervision were 
able to apply it. 


4. Subject matter—An effort was made to 
treat the same subject matter under both 
methods, so as to rule out possibly unique 
interactions with subject matter as the cause 
of learned differences in behavior. This need 
for covering a wide range of subject matter, 
determined without much reference to educa- 
tional objectives, is one source of discrepancy 
between the ideal and actual forms of the 
experimental method. 


5. Collection of evidence—The pre-, mid-, 
and post-test was the criterion battery re- 
ferred to above. This test was constructed 
along the lines identified and established by 
Ralph W. Tyler: general description of the 
ability to be evaluated, identification of spe- 
cific behaviors to be taken as evidence for the 
ability, and description of the ability from a 
pattern of sub-scores. The validity of such a 
test at the operational level is thus “built 
into” the test, since the test and criterion are 
identical at this level. Other evidence of val- 
idity of the test is the agreement between 
prediction and results of testing. 


Data from a questionnaire dealing with 
hobbies, favorite sports, academic plans, etc.; 
from PEA Interest Index 8.2a; and from 
several of the freshman week tests are pre- 
sented in describing the students fairly com- 
pletely, so that the type of students to which 
the findings are applicable may be known.⁵ 
These data are also used to describe slight 
differences among the methods groups; these 
differences, however, are treated as randomly 
distributed uncontrolled factors affecting the 
estimated error variance. In a few cases 
knowledge of these differences assisted in the 
interpretation of difference among patterns 
of scores. 


METHOD OF THE EXPERIMENT 


The major requisite of the method is that 
it leads to valid interpretations of observed 
data. Such interpretations are not “true”, but 
only probable, since human experience has a 
large number of dimensions and hence many 
degrees of freedom that may be operated on 
by “chance”. In physical science, validity is 
obtained through the use of valid procedures; 
it is assumed that, if every step in the proce- 
dure is demonstrably sound, then the results 
must be sound. The counterpart of this view 
in education would hold that, if the measur- 
ing instruments define the behaviors to be 
discussed, and if the operational details of the 
methods are clearly different and practical, 
then the results should be valid (assuming 
reasonable control in selection of samples). 
If, however, the anticipated interpretations 
of the data are theoretical principles, then 
there is a much more powerful concept of 
validity available; this concept arises de- 
ductively from the nature of theory. It is 
generally agreed that sound theory: (a) is 
consistent with other sound and relevant 
theory, and (b) may be used to predict re- 
sults in situations similar to those from which 
the theory derives. Although any one statistic 
may agree with theory by “chance” alone, a 
pattern of anticipations is most unlikely to 
eventuate through operation of the randomly 
distributed uncontrolled factors that we call 
“chance”, for, by definition, chance cannot 
be structured in any anticipated fashion. 

In this experiment both concepts of validity 
are employed. First, the measuring instru- 
ments presumably elicit (via paper and 
pencil) specific behaviors that are observed 
in the classroom. The methods are operation- 
ally described, and it can be shown that the 
activities of each day are in harmony with 

5 The intelligence of the students was close to the national 
average, but was considerably more “quantitative” than 
“verbal”. The interests were preponderantly “practical 
science”. More than half of the students held approximately 
half-time jobs. Their cultural interests reflected those of 
midwestern towns (median population of 10,200). 
the stated criteria of each method. The 
samples of students are composed of similar 
strata, and are shown by analysis of variance 
and the L-test to be closely representative of 
the same population. Second, this experiment 
is concerned with a teaching-learning situa- 
tion that is typical of those for which the 
theory of science education is relevant. There- 
fore, the theory may be used to set up a large 
number of hypotheses describing the results 
anticipated through application of the theory. 
The hypotheses, of course, must be stated in 
such language that it is possible to assess the 
degree of agreement between results and pre- 
diction. 

The hypotheses of this experiment are of 
several sorts: 


1. Specific hypotheses—Prior to the con- 
duct of the experiment, but after the methods 
had been described, it was possible and appro- 
priate to predict from knowledge of theory 
that the experimental method should prove 
superior to the control in increasing status 
with respect to all but four of thirty-one 
measures of the criterion test. After the ex- 
periment was concluded, but before the tests 
had been scored, it was possible and appro- 
priate to examine the number of activities in 
which each of the methods differences had 
been applied and to estimate the reliability 
of the anticipated differences. This additional 
refinement made the hypotheses more precise 
and hence more easily testable. The range of 
possible reliabilities was divided arbitrarily 
into three segments expressed as percent level 
ranges and chosen arbitrarily: 


a. Reliability of differences falls between 
0 and 10 percent levels. Ten of the 
thirty-one hypotheses predicted this 
order of magnitude of reliability of 
differences between experimental and 
control groups. 

b. Reliability of differences falls between 
10 and 30 per cent levels. Seven of the 
thirty-one hypotheses predicted this 
order of magnitude of reliability of 
differences between experimental and 
control groups. 

c. Reliability of differences falls between 
30 and 100 per cent levels. The remaining 
14 of the 31 hypotheses predicted 
this order of magnitude of reliability of 
differences between experimental and 
control groups. 
These three ranges may be labelled 
respectively as standing roughly for “significance”; 
“fairly reliable, significance could probably 
be achieved by better saturation of methods”; 
and “unreliable, better saturation of methods 
differences in subsequent experiments may or 
may not produce more reliable differences”. 
However, it is not necessary to attach labels 
to these ranges, since their purpose is mainly 
to provide more definite criteria for checking. 

The reliabilities of the differences between 
the two control groups were predicted to fall 
in the range from 30 to 100 percent. 
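
The three-range scheme can be made concrete with a short sketch. It assumes that a “percent level” is the probability, in percent, that a difference at least this large would arise by chance, and that a boundary value belongs to the higher range; the report does not say how boundaries were handled. The function names and the sample values are hypothetical and are not data from the study.

```python
from collections import Counter


def reliability_range(percent_level: float) -> str:
    """Map a percent level (0 to 100) to one of the three predicted ranges.

    Assumption: a value exactly on a boundary (10 or 30) is placed in the
    higher range. The report does not state a boundary rule.
    """
    if not 0 <= percent_level <= 100:
        raise ValueError("percent level must lie between 0 and 100")
    if percent_level < 10:
        return "0-10"
    if percent_level < 30:
        return "10-30"
    return "30-100"


def tally_agreement(predicted_ranges, observed_levels):
    """Count hypotheses whose predicted range matches the observed range."""
    outcome = Counter()
    for predicted, level in zip(predicted_ranges, observed_levels):
        observed = reliability_range(level)
        outcome["correct" if observed == predicted else "missed"] += 1
    return outcome


# Invented illustration only: three hypotheses and three observed levels.
predicted = ["0-10", "10-30", "30-100"]
observed = [4.0, 45.0, 60.0]
print(tally_agreement(predicted, observed))
```

Stating each prediction as a range rather than as a single cutoff is what allows a result to be scored as correct, conservative, or wrong, which is how the findings are later summarized.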


Since pre- and post-scores were obtained, 
it was deemed appropriate to employ a 
statistical method that takes account of 
differences in gains. Since it was anticipated that 
there would be some correlation between pre- 
and post-scores (especially in the control 
groups), it was presumed that part of the 
difference between pre- and post-scores could 
be predicted by means of regression alone. 
Furthermore, the samples were known to be 
closely similar initially. These considerations 
make applicable the use of the method of 
analysis of covariance, which was used.⁶ 
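
As an illustration of the adjustment that analysis of covariance performs, the sketch below computes a single pooled within-group regression coefficient b of post-scores on pre-scores and uses it to adjust each group's mean post-score to the common mean pre-score. The scores are invented, the function name is hypothetical, and the F test that completes the analysis is omitted; it is a minimal sketch under these assumptions, not a reconstruction of the computations in the complete report.

```python
from statistics import mean


def adjusted_means(groups):
    """Return (b, adjusted) for groups given as {name: [(pre, post), ...]}.

    b is the pooled within-group slope of post on pre; adjusted maps each
    group name to its mean post-score corrected to the grand mean pre-score.
    """
    sum_xy = 0.0  # pooled within-group sum of cross-products
    sum_xx = 0.0  # pooled within-group sum of squares of pre-scores
    all_pre = []
    for pairs in groups.values():
        pre_bar = mean(pre for pre, _ in pairs)
        post_bar = mean(post for _, post in pairs)
        sum_xy += sum((pre - pre_bar) * (post - post_bar) for pre, post in pairs)
        sum_xx += sum((pre - pre_bar) ** 2 for pre, _ in pairs)
        all_pre.extend(pre for pre, _ in pairs)

    b = sum_xy / sum_xx
    grand_pre = mean(all_pre)
    adjusted = {
        name: mean(post for _, post in pairs)
        - b * (mean(pre for pre, _ in pairs) - grand_pre)
        for name, pairs in groups.items()
    }
    return b, adjusted


# Invented scores for two small groups (pre, post).
scores = {
    "control": [(10, 14), (12, 15), (14, 18), (16, 19)],
    "experimental": [(11, 17), (13, 20), (15, 21), (17, 24)],
}
b, adjusted = adjusted_means(scores)
print(b, adjusted)
```

In this invented case the experimental group starts one point higher on the pre-test, so its raw advantage of 4.0 points on the post-test shrinks to 3.0 points after adjustment.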


2. Interpretative hypotheses—During the 
formulation of the experimental method, a 
series of objectives, such as “ability to visu- 
alize simple experiments from verbal descrip- 
tions”, “ability to distinguish effects from 
their causes and from their illustrations”, and 
the like, was identified. While direct evidence 
was not obtained from the test scores with 
reference to development of these behaviors, 
it was believed that careful study and com- 
parisons among the scores, coupled with anal- 
yses of students’ experimental reports and 
other work, could shed light on the relative 
attainment of these presumed abilities. Actu- 
ally, however, this list was of considerable 
importance in guiding instruction, but evalu- 
ations of these hypotheses were not attempted. 

3. Methodological hypotheses—The con- 
firmation of a large number of the specific 
hypotheses, plus satisfactory subjective evi- 
dence that other objectives were attained as 


6 In a few cases in which the correlation between initial 
and final scores in the population was low, and in which the 
gains were approximately the same size as the initial 
differences, this method actually imputed superiority to the 
inferior group. These anomalies occurred only in the 30 to 100 
percent range and are of little consequence, since such 
differences are not informative. The existence of such occasional 
anomalies might be obviated by use of separate b’s for each 
group; their existence is not considered to be sufficient 
justification for the usual practice of assigning an arbitrary level 
as “significant” and rejecting all differences at a higher per 
cent level. 
well or better in the experimental groups as 
in the control group, could be regarded as 
some evidence that the experimental method 
is a “better” method of teaching chemistry. 
It is probably safe to say that the principles 
stated above as the features of the experi- 
mental method are revealed as leading to 
more effective instruction than do the fea- 
tures of the control method. 

4. General hypothesis—The basic under- 
lying hypothesis of the experiment is that 
“through employment of instructional activ- 
ities guided by the experimental method, stu- 
dents will learn as much or more ‘subject 
matter’ as under the control method, and will 
make substantially greater progress in devel- 
opment of desirable specified abilities in the 
area of ‘critical thinking’.” This hypothesis is 
confirmed, although the meaning of “substan- 
tially greater” is dependent upon the assump- 
tions one makes about “educational impor- 
tance” as outlined below. 


SUMMARY OF FINDINGS: AGREEMENT 
BETWEEN SPECIFIC HYPOTHESES 
AND RESULTS 


In the comparisons between the experi- 
mental group and each of the control groups, 
it was found that 13 hypotheses were correct, 
and that in 9 cases the reliability of the dif- 
ference in gains was greater than predicted. 
This means that 71 per cent of the hypotheses 
were either correct or conservative and, in 
general, the achievement of the experimental 
group was slightly better than anticipated. 
Of the remaining 18 comparisons of experi- 
mental group with each of the two control 
groups, 4 hypotheses called for more reliable 
differences than found, and the remaining 14 
identified the wrong group as superior. Of 
these cases of wrong identification, 11 con- 
cerned reliabilities at the 30-100 per cent 
level and 3 at the 10 to 30 per cent level. 

In the comparisons between the two con- 
trol groups, 21 hypotheses correctly predicted 
that the difference in gain would be reliable 
only between the 30 and 100 per cent levels. 
Eight of the 9 differences reliable at the 
10-30 per cent level favored the semi-micro 
group, and the one difference reliable at the 
0-10 per cent level favored the macro group. 

The 31 scores consist of 6 scores for gen- 
eral accuracy (one score for each section of 
the test), 16 subscores appraising specifically 
described types of accuracy, scores for 
overcautious and beyond-data behavior on three 
sections, scores representing two other specific 
errors, and one score for failure to mark the 
items in a section because of alleged lack of 
knowledge. The general conclusions from 
these comparisons are: 


1. There were differences in instruction 
under the two methods. 


2. There was correspondence between the 
nature of the learning experiences and the 
consequent learnings. 

3. The bases for prediction of the extent 
of correspondence were pragmatically sound, 
implying that: 


a. The identification of the experiences 
relevant to differences in achievement 
was accurate. 


b. The estimates of the magnitudes of dif- 
ferences in instruction required to pro- 
duce reliable differences in outcomes 
were in the main correct, but tended to 
be conservative. 


c. The aspects of theory linking classroom 
experiences with achievement, in the 
main, were sound, since predictions were 
possible. 


EDUCATIONAL IMPORTANCE OF THE 
DIFFERENCES IN ACHIEVEMENT 
UNDER THE TWO METHODS 


Data are presented in the complete report 
to show that the mean gain of the group 
making the larger gain in most cases falls 
well within the range of minus one sigma to 
plus one sigma of the distributions of gains 
in the other groups. Furthermore, it is found 
that the greatest gain of any group on any 
score is 58 per cent of possible gain, and that 
the mode is more nearly 25 per cent of the 
possible gain. 
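
The two descriptive measures in this paragraph can be sketched as follows. The sketch assumes that “possible gain” means the distance from the initial score to the maximum attainable score, and that “sigma” is the standard deviation of individual gains within a comparison group; neither definition is spelled out in this report. All names and numbers are illustrative.

```python
from statistics import mean, stdev


def percent_of_possible_gain(pre: float, post: float, max_score: float) -> float:
    """Gain expressed as a percentage of the room left above the pre-score.

    Assumption: possible gain = max_score - pre.
    """
    if max_score <= pre:
        raise ValueError("no gain is possible when pre >= max_score")
    return 100.0 * (post - pre) / (max_score - pre)


def within_one_sigma(mean_gain: float, other_gains) -> bool:
    """True if mean_gain lies within one standard deviation of the mean
    of another group's individual gains (sample standard deviation)."""
    centre = mean(other_gains)
    spread = stdev(other_gains)
    return centre - spread <= mean_gain <= centre + spread


# Invented example: a mean score rising from 20 to 32 on a 40-point scale.
print(percent_of_possible_gain(20, 32, 40))

# Invented individual gains in a comparison group.
print(within_one_sigma(7.0, [2, 4, 6, 8]))
```

A group mean that sits inside the other group's one-sigma band indicates heavily overlapping distributions of gains, which is the sense in which the differences, however reliable, are small.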

The “educational significance”, generally 
thought of as the amount of difference in 
achievement per unit of teaching energy ex- 
pended, is a concept of minor importance in 
an experiment of this type. Among the rea- 
sons for this assertion are: 


1. The purpose of this experiment is to de- 
scribe correspondence between modification 
of instruction and modification of outcomes. 
The differences in outcomes are not required 
to be larger than required for “significance” 
by the method of covariance and, indeed, 
there is no advantage in the establishment of 
the relationships to finding reliabilities of dif- 
ferences at less than the 1 per cent level. 


2. This experiment is not at all concerned 
with the cost of each difference produced, 
although it is interested in discovering 
whether the experimental method is “prac- 
tical”. 


3. This investigation has tested only one 
instance of the experimental method. In any 
other instance, the method would be the same, 
but the procedures would probably be differ- 
ent. In other words, the degree of saturation 
of the methods is not rigorously defined, so 
that it is not possible to claim that the “edu- 
cational significance” of differences in 
achievement can be used to appraise the 
methods being compared. (However, they 
would appraise the two series of procedures 
used in this particular experiment.) The 
writer’s appraisal would be that the control 
methods, making use of relatively few oft-repeated 
procedures, are nearly saturated, 
whereas the experimental method as here 
applied nowhere approaches saturation. 


4. Much work would be required to decide 
whether the learning outcomes in the 
experiment are primarily convergent or divergent 
phenomena. (This is outside the scope of this 
study.) A 10 per cent increase in ability to 
interpret data, for example, might imply a 
change in outlook that would increase 
throughout a lifetime, or it might imply the 
partial mastery of a particular specific 
behavior that the student will never use again. 
The justification for research of this type 
requires the assumption to be more nearly like 
the former than the latter evaluation, and 
there is some indication within the 
experiment that this assumption is appropriate. 


RESULTS AND INTERPRETATIONS 


The data from each section of the criterion 
test, corresponding to each of the six “abil- 
ities” studied, have been analyzed in detail. 
In this report are presented the findings with 
respect to two theoretical principles, in an 
attempt to show precisely how these principles 
operate in the context of freshman chemistry 
learnings. These principles are concerned
with:


1. The relationship between “directness of
experience” and the learning of concepts.⁷

2. The relationships between experience,
verbalization, and “transfer” in the learning
of the attitude of “overcaution” with respect
to the interpretation of data, the planning of
experiments, and the predicting of results of
chemical action.⁸

⁷ Based on Chapter VI of the complete report.


Suggestions are made for modification of 
the present chemistry curriculum. 


I. “DIRECTNESS OF EXPERIENCE” AND THE 
LEARNING OF CONCEPTS 


1. Nature of the Objective 


Observations of phenomena, descriptions 
of procedures in the laboratory, methods of 
working, explanations, and predictions in 
chemistry are communicated by means of an 
extensive vocabulary of words signifying con- 
cepts whose meanings must be understood. 
Some of the different sorts of concepts for 
which words stand may be identified: 


a. Materials.—“Sodium,” “sulfuric acid,”
and “beaker” are symbols for chemical mate- 
rials or equipment. They are learned primarily 
through association of word with object. 
These words are the simplest to learn, but 
even these simple words are multi-ordinal; 
thus a beaker is: (1) a certain shape of glass 
object, (2) a container for solids and solu- 
tions, and (3) a reaction vessel. Further, a 
beaker may be any one of a number of sizes, 
and the student must learn to qualify the 
word “beaker” with such phrases as “250- 
milliliter.” The multiordinality of the terms 
for compounds is a very common source of 
difficulty. “Sodium sulfate” may be used to 
designate the contents of the fourth bottle 
on the third shelf; it may be regarded as the 
symbol for a certain white crystalline solid; 
or it may refer to a complex of properties of
a compound, as described below. 

b. Summaries of properties and processes. 
—The student of chemistry must learn to 
associate observed properties with names. 
This is one sort of ability commonly included 
under the general heading of “ability to apply 
principles.” Around the term, “sodium sul- 
fate,” the student needs to learn to marshal 
a large cluster of concepts: (1) this compound 
is a sodium salt, and hence is soluble; (2) it 
contains sulfate ion, and hence will give a 
precipitate with acidified barium chloride 
solution; (3) it may be produced by the
action of sulfuric acid on sodium hydroxide;
(4) 1 gram-molecular weight of this com-
pound will raise the boiling point of water by
1.56 degrees C, and will lower the freezing
point by 5.58 degrees; (5) a water solution
of this compound will conduct electricity;
and so on, through a long list of behaviors.

⁸ Based on Chapter XI of the complete report.

The summary of properties may include 
properties of a single specific compound, or 
may represent an abstraction, or class-name. 
Thus “acid” symbolizes a group of commonly 
observed behaviors, such as turning blue 
litmus red, catalyzing esterification reactions, 
and producing hydrogen by action on active 
metals. “Synthesis” is an abstraction of a 
common element in a great many reactions
and processes, and “distillation” calls to mind 
a set-up of apparatus and a sequence of oper- 
ations. It is probable that these summary 
words are best learned inductively, for one 
may experience operationally the properties 
the words abstract. The sound, “acid,” when 
learned thoroughly, should “call to mind” a 
sour taste, a glowing light bulb, a warmed 
test-tube, and/or a stream of gas bubbles, 
depending upon the gestalt in which the 
symbol appears. 
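
The figures quoted in property (4) of the sodium sulfate list above can be checked with a short calculation. The sketch below is illustrative only and rests on assumptions the text does not state: that the gram-molecular weight is dissolved in 1000 grams of water, that the salt dissociates completely into three ions, that the solution behaves ideally, and that the molal constants for water are 0.52 and 1.86 degrees C. The names `colligative_shift`, `KB_WATER`, `KF_WATER`, and `IONS_NA2SO4` are hypothetical.

```python
# Assumed molal constants for water (not given in the text), in degrees C
# per mole of dissolved particles per 1000 g of water.
KB_WATER = 0.52  # boiling-point elevation
KF_WATER = 1.86  # freezing-point depression


def colligative_shift(constant, moles_solute, kg_water, particles_per_formula_unit):
    """Ideal-solution temperature shift: constant x molality x particles per unit."""
    molality = moles_solute / kg_water
    return constant * molality * particles_per_formula_unit


# Na2SO4 -> 2 Na+ + SO4(2-): three particles per formula unit,
# assuming complete dissociation.
IONS_NA2SO4 = 3

boiling_rise = colligative_shift(KB_WATER, 1.0, 1.0, IONS_NA2SO4)
freezing_drop = colligative_shift(KF_WATER, 1.0, 1.0, IONS_NA2SO4)

print(f"boiling point raised by {boiling_rise:.2f} degrees C")
print(f"freezing point lowered by {freezing_drop:.2f} degrees C")
```

Under these assumptions the two shifts agree with the 1.56 and 5.58 degrees cited in the list, which suggests that the "cluster of concepts" around the name includes the number of ions the compound yields in solution.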

c. Theoretical concepts.—There are a few
substances and a great many processes that 
came into chemistry as a result of the quest 
for mechanisms to explain and predict chem- 
ical behavior. These theoretical notions them- 
selves are not available to sensory perception, 
but are the result of comparisons and analy- 
ses among summaries of observations. “Oxi- 
dation,” for example, is a broad theoretical 
concept of increase of valence. The mechan- 
ism associated with the effect is that of losing 
electrons (“electrons” is another theoretical 
concept). The observation that some sub- 
stances combine with oxygen is an illustra- 
tion of the oxidation process, and, at an ele- 
mentary level, oxidation is sometimes defined 
as “combination with oxygen.” When most
usefully understood, these concepts are in- 
sights, products of reflection, that may be 
applied to a wide range of seemingly different
phenomena. It is likely that theoretical con- 
cepts are usually learned verbally from lec- 
tures, discussion, and text, but they are non- 
functional, unless the student has had con- 
siderable experience in applying them to 
problems set in the laboratory or described in 
textbooks and workbooks. 

d. Concepts for convenience.—“Equivalent
weight” and “absolute temperature” are con-
cepts that have theoretical significance, but
the student’s chief need for them is as a con-
venience in solving problems efficiently. Stu-
dents often try to solve problems in acidi-
metry or kinetic-molecular theory by cumber-
some methods involving substitution of less
appropriate notions; although the answer may 
eventually turn out the same, the process is 
unnecessarily wearisome. In such cases the 
difficulty is usually a failure in ability to state 
the precise differences between “equivalent 
weight” and “gram-molecular weight,” and 
between “absolute temperature” and “degrees 
centigrade.” In general, there is need to know 
from where the term comes and why it is 
used. 
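
The two distinctions named in item d can be illustrated with a brief sketch. It is not drawn from the experiment itself: the gram-molecular weight of sulfuric acid (98.08), the offset of 273.15 between the centigrade and absolute scales, and the use of Charles's law for a gas at constant pressure are standard values and relations supplied here as assumptions, and the function names are hypothetical.

```python
def equivalent_weight(gram_molecular_weight, equivalents_per_mole):
    """Weight of substance that supplies one equivalent in a given reaction."""
    return gram_molecular_weight / equivalents_per_mole


def to_absolute(degrees_centigrade):
    """Convert degrees centigrade to absolute temperature (kelvin)."""
    return degrees_centigrade + 273.15


def volume_after_heating(initial_volume, initial_c, final_c):
    """Charles's law at constant pressure: volumes scale with ABSOLUTE temperature."""
    return initial_volume * to_absolute(final_c) / to_absolute(initial_c)


# Sulfuric acid has two replaceable hydrogens, so in complete neutralization
# its equivalent weight is half its gram-molecular weight (assumed 98.08).
h2so4_equivalent = equivalent_weight(98.08, 2)

# One liter of gas warmed from 0 to 273.15 degrees centigrade. A ratio of
# centigrade readings would be undefined here; the absolute scale gives a
# doubling of volume.
new_volume = volume_after_heating(1.0, 0.0, 273.15)

print(f"equivalent weight of H2SO4: {h2so4_equivalent:.2f} grams")
print(f"volume after heating: {new_volume:.2f} liters")
```

The point of the sketch is the one made in the text: the answer can be reached by more cumbersome routes, but each "concept for convenience" collapses the problem into a single division or a single ratio.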


It is clear that the precise learning of con- 
cepts requires not only the application of 
criteria for generality and specificity of terms, 
but also appropriate understanding of their 
operational, relational, and theoretical asso- 
ciations. 


2. Teaching of the Objective 


In helping students learn chemical concepts, 
the teacher may select from among three 
courses of action: (1) he may make no spe- 
cial provision of activities designed to lead to 
desirable habits, skills, and critical abilities 
in the use of chemical terms; (2) he may 
seize such opportunities as arise during the 
class and laboratory work for pointing out 
generalizations about the use of terms or for 
making observations about their etymology 
or definition; or (3) he may carefully plan 
activities to promote in students various gen- 
eralized behaviors that his analysis has con- 
vinced him are important to the objective. 

In this experiment, the teaching of the 
objective followed plan 1 under the control 
method as employed in the semimicro and 
macro classes; this plan involves the assump- 
tion that the ability to use terms correctly is 
an automatic concomitant of the learning of 
“chemistry.” Aside from the inclusion in the 
laboratory manual of a few questions dealing
with definition, little was done that may be 
expected to contribute directly to the objec- 
tive. Under the experimental method, the 
teaching followed plan 2, in that the knowl- 
edge of terminology was an objective con- 
comitant to other objectives. A fairer state- 
ment might be that the basic assumption in 
the experimental methodology is that the 
experiences of the students may be planned
to satisfy simultaneously a variety of objec-
tives, including knowledge of terminology.
Accordingly, the devices used to develop
knowledge of terminology are casual and
usually are merely an aspect of the far richer
class activity. Among the devices employed
were:


a. Definitions of terms formulated through
class discussion and written on the
board.

b. Application of criteria of definition to
the criticism and evaluation of defini-
tions suggested in class.

c. Discussion of the etymology of such
words as “anhydrous.”

d. Recognition of the different purposes
and levels of definition, and selection of
the most appropriate definition. Thus,
“reduction” may mean “loss of oxygen,”
“decrease in valence,” or “gain of elec-
trons.”

e. Provision of opportunity during discus-
sion for students to receive aid from
teacher and class in making their mean-
ings clear.

f. Emphasis upon conciseness and clarity
in the individual written laboratory re-
ports. These reports were carefully read
and marked by the instructor, and the
most common errors and misconceptions
were subsequently discussed in class.

g. Formulation of concepts only after ex-
periences designed to make the concepts
meaningful.

h. Placement of emphasis upon obtaining
meaning from the book. Although this
too often took the form of exhortation,
two special activities (reading the book
line-by-line and locating definitions and
arguments in the book) were employed.


3. Appraisal of the Objective 


a. The objective of desirable changes in
behaviors.—Reflection upon the observed
difficulties of students in communicating
knowledge suggested that the general behavior
which might be used to appraise the objective
is the distinguishing of the “best” statement
of the meaning of a term from partially
correct statements that students often confuse
with the “best” meaning.⁹ Casting the situa-
tions in multiple-choice items makes possible
the appraisal of ability to make distinctions
of any desired degree of delicacy. In the test
used, a coarse distinction is the recognition
that a “factor” is “anything which may affect
the result of an experiment” rather than a
“person who seeks facts.” A fine distinction
involves such things as identifying “weight”
as “a force” rather than as “a mass”; the
common confusion arises from the fact that
“weight” and “mass” are used practically as
synonyms in chemistry. Not only may dis-
tinctions be delicate or coarse, but they may
vary as to their nature, and a test that is to
appraise the objective adequately must in-
clude situations involving the more impor-
tant sorts of distinctions. The test contains
erroneous statements of the following types:

⁹ This decision includes the assumption that ability to com-
pare given statements correlates highly with ability to volun-
teer correct statements. There is some evidence in certain
kinds of exercises that this is true. The interpretations in this
section are confined to the objective as defined by the measur-
ing instrument.


1. Gross misconception: experimental error 
as “an experiment performed by mistake,” 
and the use of valence to decide whether an 
element is “deeply colored or transparent,” 
are examples of misconceptions. The choices 
involving misconceptions provided an oppor- 
tunity to suggest non-technical responses: it 
was expected that the initial frequency of 
marking misconceptions would be about that 
expected by chance. 


2. Too general statement: “amount” is a
too general definition of “volume”; a very
plausible definition which is too general, be- 
cause it does not control temperature, is that 
“air pressure” is a “measure of the number 
of molecules of air in a given volume.” Too 
general statements embrace too wide a range 
of occurrences, fail to control important fac- 
tors, or fail to be sufficiently specific to dis- 
tinguish between phenomena of the same 
class. 


3. Too specific statement: calling an 
“atom” “the smallest part of a metal that 
has the properties of the metal”; restricting
“evaporation” to situations involving “high 
temperatures”; and limiting “synthesis” to 
“forming a precipitate” are examples of de- 
fining terms too narrowly or too specifically. 


4. Statement of cause or mechanism for 
effect: “diffusion” as “the result of molecular 
impact” and “experimental error” as “due to 
incorrect way of reading instruments” are 
examples in this category. Prevalence of this 
sort of error usually suggests confusion of 
phenomena with their theoretical explanations 
or with relevant observations.


5. Statement of illustration for effect: “air
pressure” may be illustrated incorrectly as
“the weight of air in a tire”; “filtration” is
illustrated in “separating water from mud.”
The use of illustrations usually suggests fail- 
ure to generalize an experience to the most 
useful degree of abstraction. 

6. Statement involving misunderstanding 
of scientific method: since understanding of 
scientific method was a major objective in 
the experiment, it was desired to provide 
opportunity for the assertion of confusions 
about the terminology of scientific method. 
These confusions may include some of the 
sorts of errors listed above. Examples in this 
test include: a “fact” is something which 
“can be proved,” and “an experiment” is per-
formed to “prove the law of cause and effect.”
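
The objection raised in type 2 above, that defining "air pressure" as a measure of the number of molecules in a given volume fails to control temperature, can be made concrete with the ideal gas law, P = nRT/V. The sketch below is a standard textbook relation supplied for illustration, not an item from the test; the value of the gas constant, the volume of 22.4 liters, and the name `pressure_atm` are assumptions.

```python
# Gas constant in liter-atmospheres per mole per kelvin
# (standard value, not taken from the text).
R = 0.082057


def pressure_atm(moles, liters, kelvin):
    """Ideal-gas pressure P = nRT/V, in atmospheres."""
    return moles * R * kelvin / liters


# The same number of molecules in the same volume, at two temperatures.
cold = pressure_atm(1.0, 22.4, 273.15)  # 0 degrees centigrade
warm = pressure_atm(1.0, 22.4, 373.15)  # 100 degrees centigrade

print(f"{cold:.2f} atm at 0 C, {warm:.2f} atm at 100 C")
```

Because the two pressures differ while the count of molecules per unit volume is identical, the "too general" statement cannot serve as a definition unless temperature is also specified.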

Terms may be improperly defined, if the 
student is confused in any of the ways indi- 
cated above. In addition, there are different 
“levels” of definition; a term may be defined 
operationally or in theoretical language. Three 
classifications were used in building the test:

1. Definition through application or func- 
tion: a “catalyst” may be defined as “some- 
thing which changes the speed of a chemical 
reaction,” and an “electrolyte” is “a solution 
which conducts electricity.” In these cases, 
the term acquires meaning through the useful 
characteristics of the phenomenon repre- 
sented. 

2. Literal definition: words like “hypo-,” 
“anhydrous,” and “equilibrium” may be de- 
fined on the basis of a knowledge of words 
rather than of phenomena. Such definitions 
are thought of as “literal definitions.” 

3. Technical definition: a highly theoret- 
ical statement about a “non-metal” is that 
“when reacting with other substances, it tends 
to take on electrons,” or that “temperature 
is a measure of the amount of kinetic energy 
of the molecules in a body.” In general, tech- 
nical definitions are relational in that the 
term is defined through its relations with 
other terms. 

The foregoing analysis of behaviors served 
a useful function in the construction of the 
test, for it made possible the inclusion of re- 
sponses that could sample a wide range of 
possible behaviors. The analysis was also 
useful in the conduct of class discussions, since 
the experience of making the analysis left the 
teacher somewhat more sensitive to specific 
sorts of difficulties that impair communica-
tion. There are, however, many commonly
held mistaken concepts that represent error
of fact or principle, and they are not classi- 
fiable under the foregoing scheme, but they 
are patently involved in communication and, 
therefore, belong in this section of the test. 
While the foregoing analysis is suggestive, 
nonetheless the first criterion to be applied 
to every item is that it is valid, in the sense 
that the answers from which the student 
selects are the sort of things students ordi- 
narily would volunteer. Therefore, this anal- 
ysis does not define a list of specific objectives 
for which specific procedures are planned, but 
it does supply a partial critique for use in 
teaching the general objective. 

b. The objective of informational content. 
—The behaviors listed above can be exhib- 
ited only with respect to a situation—a 
specific pattern of stimuli that call for rele- 
vant behavior. The situation set in each test 
item is defined by the particular connotation 
of the particular term being tested. In gen- 
eral, the connotation being tested is either the 
most important one from the standpoint of 
further learning of chemistry or else it is be- 
lieved to be the most useful in discriminating 
between levels of achievement of the objec- 
tive. (The use of the most discriminating con- 
notation involves the assumption that the 
behavior it calls forth will correlate highly 
with the behavior elicited with respect to the 
most important connotation.¹⁰)


4. Preliminary Results and Assumptions in 
Treating and Classifying Data 


One may postulate that, besides the factors 
controlled by the experimental procedures, 
there are two sorts of factors affecting the 
learning of the concepts tested: 

1. Differences attributable to the nature 
of the concepts themselves and to the avail- 
ability of the concepts through various gen- 
eral learning procedures, such as laboratory, 
discussion, and lecture. 

2. Differences due to the method of super- 
vising and organizing the learning activities 
in the three methods groups. 

Inspection of the data shows that the learn-
ing of the concepts tested in Part I of the test
was quite similar under the two methods of
instruction. On the other hand, great differ-
ences in initial scores and gains during the
semester and year are noticed from concept
to concept, and this suggests that a fruitful
analysis of the data would be one which first
identified and rationalized the gross differ-
ences among concepts and then studied the
differences between methods groups as an
additional factor.

¹⁰ A discussion of the procedures used to select representa-
tive terms and “reasonable” responses is omitted from this
report.

¹¹ It is interesting to note that about 75 percent of the …
developed in the first semester, although the
“meaning” of most of the terms changes throughout the year.

The first step in the treatment of these 
data consisted in classifying the results of the 
testing and in identifying groups of concepts 
whose learning appeared to be based on simi- 
lar experiences and whose nature appeared 
to be much the same. Before proceeding to a 
study of each category, it is desirable to pre- 
sent evidence that the categories named in the 
stub of the tables are clearly distinguishable. 
The outcome of this procedure is shown in 
Table I. 

Table II presents the gain in per cent of the
group making the correct response under the
experimental, semimicro, and macro methods.
The classification of concepts is the same as
in Table I.

Tables I and II show that the groups of
items classified as indicated behave very dif-
ferently on the test. Initial means in the cate-
gories range from 10 per cent to 76 per cent
mastery for the entire populations. Gains
range from -7 per cent to 50 per cent in the
methods groups. Little relationship is seen to
exist between initial and final mean scores
among the various categories, thus suggesting
that the categories are independent.

TABLE I

LEARNING OF THE CATEGORIES OF CONCEPTS BY THE ENTIRE POPULATION OF THE EXPERIMENT

Average Per Cent of Possible Score (expressed as per cent of students marking items correctly)

Column headings: Classification of Concepts; Number of Concepts; First Semester (Initial, Final, Gain); Year (Initial, Final, Gain)

Classifications listed: Functional; Common Usages; Methods Implications; Not Taught; Total, average

First Semester, Final: 75, 65, 25, 85, 24, 26
Year, Final: 88, 71, 32, 86, 28, 24
Total, average: 55, 60


TABLE II

GAIN IN PER CENT OF STUDENTS IN EXPERIMENTAL, SEMIMICRO, AND MACRO GROUPS MARKING ITEMS CORRECTLY

Gain in Per Cent of Group

Classifications listed: Functional; Theoretical; Common Usages; Methods Implications; Not Taught; Total, average

First Semester, SM: 32, 14, 13, 9, 3, -1
Year, SM: 47, 17, 22, 10, 5, 0
(column heading missing): 24, 17, 13, 6, 2, 4
Total, average: 13, 12, 18


The classification: general statement.—The
concepts contained in categories A through D
may be said to have been “taught.” They
were presented in lectures attended by the
students unsegregated in methods groups.
Most of the concepts were discussed or used
in discussions of problems in the quiz-
laboratory sections, which were segregated
into classes under the experimental, semi-
micro, and macro methods of instruction. In
these methods groups, assimilative experiences
were directed. Textbook chapters (the same
in all classes) were assigned and sometimes
discussed; more-or-less assistance was given
students working on problems assigned dur-
ing lectures; the laboratory experiments were
introduced, carried out, and followed up; and
any special assignments were made. In addi-
tion, various amounts of remedial help were
given to students by their instructors outside
of class hours.

On the basis of these conditions, certain 
assumptions appear to be justified: 

1. Any differences in outcomes in the 
various methods groups cannot be attributed 
to lecture experiences, since the groups were 
not segregated by methods in the lecture 
sections. 

2. Although the effectiveness of the lecture 
presentation was probably not the same for 
the various concepts taught, it is likely that 
the assimilative experiences in the quiz- 
laboratory sections determined to a large 
degree the amount of learning that took place. 
If this assumption is correct, it should be 
possible to classify the items according to the 
general nature of the assimilative experiences 
and to find variation in outcomes among the 
categories attributable to differences in the 
nature of these assimilative experiences. 

3. Although the general assimilative experi- 
ences (laboratory, as distinguished from dis- 
cussion, problem work, etc.) may be the same 
under the various methods, the effectiveness 
of the experiences may be quite different due 
to differences in organization, motivation, 
follow-up, use of review, and the like. Such 
differences would be revealed mostly in the 
learning of individual concepts, although gen- 
eral superiority of certain kinds of procedures 
may be revealed from category to category. 


5. Brief Characterization of the Categories¹²


Category A, “experimental”.—The concepts
classified in this category are characterized
as follows:


1. Part or all of a laboratory experiment
was experienced by the students for the
purpose of learning the concept.

2. The experiment was direct and valid.

3. The concept may be obtained from the
percept by simple abstraction.

4. The statement of the concept is couched
in simple, relatively non-technical lan-
guage.

¹² The original treatment of this section includes the follow-
ing information about each of the thirty concepts tested:
the concepts thus categorized, information about their learn-
ing as gleaned from analysis of right and wrong responses,
information about the experiences likely to have resulted in
the learning, and whatever evaluations of the effectiveness of
the methods seem reasonable.


Summary, “experimental” category.—The
data on the learning of the four concepts most 
directly illustrated through individual experi- 
ments are striking. In this category fall three 
of the five items of the entire test on which 
the gain of all three groups was significant 
at the one per cent level over the year’s work. 
The learning of the fourth concept in this 
group was significant for all three groups at 
the five per cent level; only five additional 
items in the test revealed gains at this level 
of significance for all three groups. These 
findings lend support to the notion that lab- 
oratory experiences from which concepts may 
be obtained through simple verbalization are 
most effective in promoting learning. 


In the experimental methods group, the 
procedures were designed to provide a closer 
connection between the laboratory experi- 
ences and the subsequent formation of con- 
cepts from these experiences. It is reasonable, 
therefore, to expect the experimental group 
to gain more significantly than the other 
groups. Table II shows that the superiority 
was marked over the first semester, but that 
all groups had reached about the same degree 
of proficiency with the “experimental” con- 
cepts by the end of the year. Consideration 
of the nature of the learning experiences pro- 
vided for the formation of the four concepts 
shows that two mechanisms were operative in 
leading to the superiority of the experimental 
sections: (1) in the case of “amorphous” and 
“electrolyte,” the concepts were studied ex- 
perimentally by the X groups, but not by the 
semimicro and macro groups during the first 
semester; (2) in the case of “anhydrous” the 
experiment was more effective in the experi- 
mental groups than in the other groups. 


Category B, “functional”.—The concepts
classified in this category are characterized as
follows:


1. The concept is necessary for the descrip-
tion or simple explanation of phenomena.

2. The concepts are stated in simple lan-
guage and probably involve little reor-
ganization of experience for their recog-
nition.

3. The concepts have been illustrated in
aspects of laboratory experiments, but
it is likely that the relationship between
concept and experiment has not been
made clear.


These criteria distinguish between the
“experimental” concepts and the “functional” 
concepts in terms of the closeness of their 
association with particular experiments. The
latter are designated as “functional,” because they
are among the handy conceptual tools needed 
for chemical communication; a person who 
cannot use these concepts may be regarded 
as chemically illiterate. 


Summary, “functional” category.—Table I
indicates that the learning of the “functional” 
concepts, while greater than in some other 
categories, was much less than the learning 
of the concepts directly demonstrated in ex- 
periments. Table II shows that the experi- 
mental groups were the most successful in 
the learning of these concepts, and analyses 
of the learning of each concept suggest that 
the reason is the closer relationship between 
the concepts and laboratory experiences. 


The learning of the concept of “synthesis” 
illustrates the possibility of a simple notion 
remaining vague and not becoming associated 
with any definite sort of reaction, even though 
the students certainly worked with such re- 
actions. The learning of the meaning of “one 
equivalent” is even more striking, for in this 
case the purpose of one experiment was to 
clarify the concept. The fact that the defini- 
tion required in the test was at a more theo- 
retical level than the definition presumably 
learned from the experiment may account for 
the poor learning; on the other hand, failure 
to be able to reformulate the outcome of the 
experiment in terms of the test item shows 
how superficial the learning from the experi- 
ment actually was. 


In general, the analysis of the learning of 
this category suggests very clearly that it is 
possible to focus experiments in such a way 
that the learning of important common con- 
cepts may be far more effective. Another way 
of stating the conclusion would be that the 
experiments, guided primarily through the 
laboratory manuals, were less efficient as aids 
to learning than were similar experiments, 
developed and followed up in class discussion. 


Category C, “theoretical”.—The concepts
classified in this category are characterized as
follows:


1. They represent “new” learning, in that
the initial scores on the items were con-
siderably lower than would be expected
by chance. For practical purposes, it
may be said that the students had no
initial knowledge of these concepts.

2. The concepts are theoretical and tend
to be non-operational. They cannot be
demonstrated experimentally in any
direct fashion, nor do they acquire any
particular significance through reflection
on common experience (with one excep-
tion).

3. Little use of the concepts was required
in the course. They were not mastered
to the point of usefulness in prediction
and explanation, although they were
used for these purposes from time to
time.

Summary, “theoretical” category.—Inspec-
tion of Table I suggests that the learning of
the theoretical items is about as great as the 
learning of the functional items, in spite of 
the more tenuous relationship of the concepts 
to concrete experience. On the other hand, 
the gain for the year on the functional items 
was 24 out of a possible 53 per cent, or 45 
per cent, whereas the gain on the theoretical 
items by the same method of calculation was 
but 24 per cent of the possible gain. The 
apparent superiority of the experimental 
group is small. Since the theory was related 
to laboratory work somewhat more in the 
experimental group, this difference is in line 
with the hypothesis and is thus more likely 
to be “real” than would be suggested through 
application of statistical techniques alone. 
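
The "method of calculation" used in the comparison above expresses a gain as a percentage of the gain still available to the group, that is, of 100 per cent minus the initial score. A minimal sketch follows, using only the functional-category figures quoted in the text (a gain of 24 against a possible 53); the function name `percent_of_possible_gain` is hypothetical.

```python
def percent_of_possible_gain(gain, possible_gain):
    """Express a gain as a percentage of the gain that was still available."""
    if possible_gain <= 0:
        raise ValueError("possible gain must be positive")
    return 100.0 * gain / possible_gain


# Functional items, year: gain of 24 out of a possible 53 per cent.
functional_share = percent_of_possible_gain(24, 53)

# A possible gain of 53 implies an initial score of 100 - 53 per cent.
implied_initial = 100 - 53

print(f"functional items: {functional_share:.0f} per cent of the possible gain")
print(f"implied initial score: {implied_initial} per cent")
```

Scaling by the possible gain is what allows categories with very different initial scores to be compared: a raw gain that looks similar in two categories may represent quite different shares of what remained to be learned.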

Analysis of the interpretations of the learn- 
ing of the items in this category leads to cer- 
tain generalizations that seem reasonable in 
the light of performance: 


1. In five of the six items in this category,
a majority of the students marked a
particular wrong response at the end of
the year. In three of these cases the re-
sponse was actually incorrect; in the
other two cases it was partially correct.

2. In general, the most popular response
tends to be more operational than the
correct response.

3. The learning of the theoretical responses
is specific, superficial, and useless,
rather than profound, insightful, and
functional.

4. Learning of theory is dependent upon
its use in interpreting concrete experi-
ences.

In general, the learning of the theoretical
items is thoroughly unsatisfactory. There is 
no doubt that theory is difficult to teach effec- 
tively, because it involves either: (1) a very 
broad experimental base which is hard to 
provide and difficult to interpret; or (2) a 
great deal of insight, so that relationships 
may be deduced from definitions. As in all 
curriculum construction, two questions are 
pointedly raised: (1) What theory is impor- 
tant (and for what reason)? and (2) How 
can it be efficiently taught? 

Category D, “common usages”.—The con-
cepts classified in this category are charac-
terized as follows:


1. The concepts are known initially to a 
great majority of the students, and are 
fairly common in everyday communica-
tion of ideas, both in life and in chem-
istry.

2. The meaning of the concepts is little 
changed by their use in chemical con- 
texts. 


The high initial accuracy with these con- 
cepts means that a gain significant at the 5 
per cent level would require that nearly every 
student mark the correct statement. The final
average per cent mastery for the semester and
year groups would have to be 91 per cent and
93 per cent, respectively, for
significance at the 5 per cent level. It is clear
that such items are not likely to be discrimi- 
nating among methods. Since the concepts are 
so generally understood, no attempt will be 
made to cite specific learning experiences, 
except as required for interpretation of results. 

Summary, category D, “common usages”.
—Table I indicates that the terms were
known initially to three-fourths of the group. 
Analysis of the responses shows that the re- 
maining quarter tended to mark relevant 
responses, suggesting that all the group “had 
an idea” about the concepts. The gains of 9 
per cent the first semester and 11 per cent 
during the year amount to 37 per cent and 
44 per cent respectively of the possible gains. 
Table II reveals that the gains in the experi- 
mental group appear to be slightly higher 
than those of the other group. 

In the case of two of the items, it is sug- 
gested that a correct but simple definition 
whose relevance is not obvious may tend to 
be eschewed for a more relevant-seeming 
technical statement. In general, the learning 
appears to have progressed during the year.
Special learning activities are unnecessary for 
these concepts, because the student has 
already had sufficient experience with them 
to render the incidental contacts in chemistry 
meaningful. 

Category E, “methods implications”.—The
concepts classified in this category are char- 
acterized as follows: 


1. The concepts are insights probably de- 
veloped through reflection upon methods 
of science. 

2. The concepts were not “taught” as verbal
statements; the terms were used,
however, in context on many occasions.


Summary, category E, “methods implica- 
tions”.—Table I shows that the initial mas- 
tery of the concepts in this category is about 
what one would expect from guessing alone. 
There appears to be a slight gain during both 
semesters, but it may not be reliable. There 
were small but significant shifts in individual 
items, but they involved few students. In 
general, one would conclude that the concepts 
have not been learned, that working with 
chemistry for a year by no means assures the 
understanding of even the simplest implica- 
tions about method. Since the “correct” re- 
sponses to these four items represent simple 
concepts, and since the other responses are 
fairly obviously not tenable, it may be safe 
to generalize as follows: If understanding of 
scientific method is an objective of the course, 
then activities must be especially planned 
for this objective. 

Category F, “not taught”.—The concepts
classified in this category may be charac- 
terized as follows: 


1. The concept was not presented in the 
course. 

2. The concept is not a reasonable infer- 
ence from the experiences of the course. 


Summary, “not taught” category.—Table I 
indicates that the accuracy in this category 
is about what one would expect from the 
operation of chance alone. The results show 
slight shifts, particularly toward “mass” as a 
definition of “weight.” The terms are used 
synonymously by most chemists, since weight 
is a measure of mass, although weight is actually
a force. The obviously irrelevant responses
to weight have not increased, suggesting
that there has been some orientation
with respect to “weight” during the year.

In general, the performance on the two 
items of this category suggests the conclusion 
that concepts not taught will not be learned 
—certainly a reasonable notion. 


6. Summary and Interpretation of Results 


1. Comparative effectiveness of various 
types of learning experience. 

a. The items of this part of the test have 
been categorized under six headings. Average
scores on the items in the six categories differ 
in initial accuracy, final accuracy, and gain 
during semester and year. The methods of 
teaching of the items are homogeneous within 
each category and different between cate- 
gories, so that in the absence of any other 
systematic differences between categories it 
is reasonable to suppose that differences in 
learning are associated with the differences in 
general procedures of teaching. 

b. A clear picture of the differences in gain 
between the categories is revealed in Table III, 
in which the average gain for the population 
as a whole is compared with the possible gain 
defined as the difference between the initial 
average score and 100 per cent. 
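
The ratio described here can be written out as a short calculation. The sketch below is illustrative only: the function name is an assumption, and the sample scores are chosen to match the category D figures quoted earlier (about three-fourths initial mastery and an 11 per cent gain over the year), not taken from Table III.

```python
def percent_of_possible_gain(initial: float, final: float) -> float:
    """Return the observed gain as a percentage of the possible gain.

    Scores are average per cent correct (0-100). The possible gain is
    the distance from the initial average score to 100 per cent.
    """
    possible = 100.0 - initial
    if possible <= 0:
        raise ValueError("initial score leaves no room for gain")
    return 100.0 * (final - initial) / possible


# Illustrative scores: a group starting at 75 per cent and ending at 86.
print(round(percent_of_possible_gain(75.0, 86.0)))  # 44
```

Expressing gain this way keeps a category with a high initial score from looking weak merely because it had little room left to improve.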

The preceding analyses suggest that the 
gains in Table III may be considered as due 
to the interaction of two factors: initial 
knowledge (presented in Table I), and nature 
of the learning experiences (presented in dis- 
cussion). The nature of the interaction is 
clearly revealed by comparisons among the 
categories, thus: 

(1) The initial scores in categories E and 
F were approximately the values expected 
from the operation of random guessing. The 
items were not taught, i.e., although closely 
related concepts were used in class discussion, 
the specific aspects called for in the items
could be learned only as insights developed 
by the students without the aid of specific 
experiences planned for the purpose. The gain 
is seen to be negligible. 

(2) The three categories A, B, and C may
be thought of as establishing a continuum
with regard to directness of experience. Concepts
in the experimental category A were
learned through direct laboratory experience;
category B, functional, contains concepts
whose learning resulted from daily use in
describing or explaining experimental results,
i.e., concepts that could be associated with
several experiences in the laboratory; category
C, theoretical, contains concepts whose
learning must be almost entirely the result
of vicarious experiences. Table III reveals
that the differences in learning are spectacular
(particularly for the year). To support the
generalization that the more direct the experience
the higher the gain, one would have to
take account of differences in initial scores,
for the magnitude of the gains may be related
to the initial scores in one of several
different ways:


(a) If the initial score is high, there is less
opportunity for gain and less gain is
anticipated.

(b) If the initial score is high, the student
has a sufficiently good conceptual
framework that additional learning is
easy and a large gain is anticipated.

(c) Within fairly wide limits, the magnitude
of the initial score cannot be used
to predict the magnitude of the gain.

TABLE III
PER CENT THAT AVERAGE GAIN IS OF POSSIBLE GAIN

[Table entries not recoverable from the source. Surviving row labels under “Classification of Items”: Experimental, Functional, Common Usages, Methods Implications, Not Taught. Surviving column headings: Number; Per Cent of Possible Gain (First, Year).]

Comparisons among the categories of initial
scores and gains provide some evidence as to
which of the three points of view is most
appropriate. In two of the categories (A and
B) the initial scores are practically the same
and the generalization about directness of
experience is supported. In category C, the 
initial scores are very low, the learning ex- 
periences are indirect, and the gain is low. 
The smallness of the gain again accords with 
the generalization that direct experiences are 
most effective, and it also agrees with the 
second assumption about the role of initial 
understanding, to the effect that little learn- 
ing is anticipated if the student has no initial 
conceptual framework. There is no way of 
telling from these data which factor (initial 
score, or directness of experience) is primarily 
responsible. 

(3) The gains in categories B and D are 
very similar, and the learning experiences 
were of the same order, i.e., in both categories 
the learning resulted from the need for using 
the concepts for purposes of describing chem- 
ical situations. The initial scores, however, are 
quite different. This tends to support the third 
position with respect to magnitude of initial 
scores, that within fairly wide limits, the mag- 
nitude of the initial scores cannot be used to 
predict the magnitude of the gains. Applica- 
tion of this notion to the comparisons among 
categories A, B, and C would rule out the 
effect of initial knowledge as a factor in 
accounting for the low gain in the “theoret- 
ical” category and would leave directness of 
experience as the primary factor in learning. 

c. The foregoing analysis has led to the 
conclusion that the more direct the experience 
the greater will be the learning, regardless of 
the magnitude of the initial knowledge (with- 
in limits not specified). At this point it may 
be well to evaluate the extent to which the 
conclusion is justified from the experiment. 
Generally speaking, two considerations enter 
at this point. First, it must be shown that the 
items in a given category fairly represent that 
category. None of the categories contains 
more than seven items; the seven specific be- 
haviors elicited in such a category would 
seem to be insufficient to serve as a basis for 
sweeping generalizations about behaviors. On
the other hand, if it can be shown that in each 
of the categories the sampling is excellent— 
that each specific item epitomizes its category 
—then the generalization is much more 
tenable. Detailed analyses of the results from 
each item were necessitated by this consider- 
ation and represent an effort to present the 
evidence needed to decide whether the cate- 
gories are well established in this experiment. 
The particular issues in this experiment are:
(1) Has it been shown that the learning ex- 
periences of all the items in a given category 
are similar (and different from the learning 
experiences of the items in other categories)?
and (2) Have the categories been correctly 
characterized in general terms? In the writer’s 
opinion the discussions provide affirmation on 
both issues. 

The second demonstration has to do with 
the reliability of the experiment. If it is agreed 
that the categories exist and have been appro- 
priately described, then the next question is: 
How “real” are the observed differences in 
learning in the categories, i.e., to what extent 
can the differences be accounted for by fluc- 
tuation of uncontrolled randomly-operating 
factors? Since the categories are assumed to 
control factors relating to instructional pro- 
cedures, the major remaining possible source 
of uncontrolled factors is the individual 
learner. Since there are 144 learners in the 
semester group and 84 in the year group, it 
is highly unlikely that the same chance fluc- 
tuations would operate in the same way with 
any noteworthy fraction of the population; 
therefore, the differences in gains could be 
due only to systematic differences in learning 
procedures. The writer believes the conclusion 
that directness of experience is the major 
factor in the learning is justified from this 
experiment. The fact that this generalization 
is generally accepted in educational psychol- 
ogy increases the validity of the conclusion. 

2. Comparative effectiveness of the three 
methods of teaching.—The preceding analyses 
suggest that the most effective learning of 
concepts comes through direct experience in 
experimental situations designed to facilitate 
the learning of the concepts. Next in effec- 
tiveness appear to be experiences eliciting the 
use of the concepts for communication of 
description of experimental procedures, ob- 
servations, and results in a wide range of situ- 
ations. Less effective is the vicarious verbal 
experience with theoretical concepts that are 
not connected functionally to any operations 
by the learner. Least effective is poorly moti- 
vated learning through induction from experi- 
ence of concepts that are not held as objec- 
tives by the students. 

The rationales and the teaching procedures 
of the various methods as described above 
suggest two bases for prediction that the 
experimental groups would demonstrate superior
gains: (1) On the assumption that the
foregoing hierarchy of effectiveness exists, 
the learning experiences were planned at a 
more operational level in the experimental 
groups; and (2) when the experiences were 
at the same level in the hierarchy they were 
expected to be more effective in the experi- 
mental group, because the objectives of the 
day’s work were more clearly identified and 
striven toward. The data are inadequate to 
indicate what portion of any observed supe- 
riority may be due to either factor alone. 

a. One procedure for comparing the out- 
comes under the three instructional methods 
would be to tabulate the number of gains (at 
some arbitrarily chosen level of significance) 
in correct responses made by the groups. 
(The results of the procedure will be found 
in the original report.) 

b. A more statistically adequate procedure 
for comparing the gains under the three meth- 
ods of instruction is through application of 
analysis of covariance, since the comparison 
is then not confined to gains at the 5 per cent 
level or better. Consideration of all the data 
might be expected to produce more reliable 
differences among the methods. Table IV 
compares the average gains in general accu- 
racy (right responses) among the three 
methods. 

TABLE IV
LEVEL OF SIGNIFICANCE OF DIFFERENCES IN GAINS AMONG METHODS, GENERAL
ACCURACY, PART I, BY ANALYSIS OF COVARIANCE

[Table entries garbled in the source. Columns compare a (greater) with a (lesser) group: Experimental vs. Semimicro; Experimental vs. Macro; Semimicro vs. Macro. Surviving row label: First Semester. Surviving entries as printed: 437%, —40%, 2.38%, 10%.]

Table IV shows that the reliability of the
differences in gains between the experimental
groups and the other groups is quite marked,
whereas the superiority of the semimicro
groups over the macro groups is unreliable.
The differences are more reliable than the
hypotheses anticipated: according to the specific
hypotheses, the experimental group was
expected to be superior to the other groups
at a level of significance between 10 and 30
per cent; actually, in two out of four cases,
the reliability is considerably greater than
this. The specific hypotheses show that no
difference between the gains in the semimicro
and macro groups was anticipated, and this
agrees well with the 40 per cent level indicated
in Table IV. It appears that the hypotheses
about achievement on Part I of the criterion
test have been confirmed, within the
statistical and other limitations of the experiment.


3. Summary of Conclusions


a. The rationale of the experimental method
includes the initial assumption that learning
is most effective when it comes through direct
sensory experience of the learner. This
assumption was regarded as an hypothesis
and tested through analysis of the learning
of chemical concepts as measured in Part I
of the criterion test. A hierarchy of experiences
arranged in order of effectiveness was
identified, and the existence of this hierarchy
appears to support the initial assumption and
thus to add validity to the rationale.

b. A number of interpretations of the course
of the learning of various concepts were
attempted. These interpretations are guesses
arrived at inductively from detailed study of
single items, but they may be useful as
hypotheses for further investigation. Among
these possible hypotheses are:

(1) In proceeding from wrong to right responses
with regard to a concept, the students
are likely to pass through a stage in which
the wrong response is avoided and relevant
but incorrect responses are accepted. In gen- 
eral, “relevant” in this connection means 
that the response refers to a class of sub- 
stances or a process with which the student 
has had experience at the time of exposure to 
the concept. It is possible to accept and sub- 
sequently reject more than one relevant re- 
sponse, depending upon the nature of the 
experiences with the concept that one has at 
different times. During this period it seems 
likely that the concept is purely connotative; 
it is connected with isolated experiences and 
there is no apparent success (probably no 
effort) to rationalize the concept into any 
consistent chemical scheme. This period may 
last up to a year at least (this study has no 
evidence beyond a year). Selection of the
right response may indicate that: (1) the 
student has rationalized the concept; (2) the 
student has learned it verbally; or (3) the 
student has finally “discovered” it by abstrac- 
tion from his various experiences with it. 

(2) Opportunity for effective learning of a 
particular concept does not necessarily mean 
that a closely related concept will be learned. 
Thus one definition of a term may be learned 
through experiment and another definition of 
the same term, at a slightly more theoretical 
level, may not be understood any better than 
formerly. There seems little doubt that inte- 
gration of relationships by the students is far 
from automatic. 

(3) When students learn a concept first at 
the operational level, and later at the theo- 
retical level, they appear to be confused. It 
is suggested that an important adjunct to 
teaching of this sort of concept would be the 
teaching of the nature of definition and its 
various levels, and the classification of defi- 
nitions according to their level in the 
operation-theory continuum. 

(4) Mere theoretical study of a concept is 
relatively ineffective in correcting a popular 
misconception. 

(5) There may be a tendency for students 
to abandon popular but correct concepts for 
more technical relevant but incorrect re- 
sponses. Laboratory experiments may be able 
to raise doubts in the minds of the students 
about points that were once fairly obvious to 
them. 

(6) The difference in gain in general accu- 
racy of the experimental groups over the other 
groups is statistically significant between the 
1.5 and 14 per cent levels. There is no reli- 
able difference in the gains of the semimicro 
and macro groups. It seems reasonable to 
conclude that the experimental method of 
teaching, as outlined above, is more effective 
than the methods in the semimicro and macro 
groups with respect to the objectives meas- 
ured in Part I of the criterion test. 


II. THE RELATIONSHIP BETWEEN EXPERI- 
ENCE AND VERBALIZATION IN THE 
LEARNING OF THE ATTITUDE 
OF OVERCAUTION 


In this experiment, it has been useful to 
recognize that there are two kinds of methods 
differences: (1) the general or summarized 
differences between groups of students taught 
by different methods, and (2) the specific
differences within any particular group. These 
features are illustrated in the treatment of 
data in the preceding section. The first kind 
of difference is given by the statement that 
the reliability of the difference in gains be- 
tween semimicro and experimental groups was 
at the 1.5 per cent level. The second kind of 
difference is given by the statement that 
within the entire population of the experi- 
ment, the gain in learning of concepts closely 
related to experiments was almost twice the 
gain on concepts used mainly for communi- 
cation of chemical information and proce- 
dures. 


By making one reasonable assumption, that 
the abilities to be discussed now were not 
“taught” in the control groups, it is possible 
to regard differences in gain between experi- 
mental and control groups as a function solely 
of the “teaching” under the experimental 
method. Since the activities employed in the 
experimental method are known in detail, 
and since these activities were guided to the 
best of the writer’s ability by the stated 
philosophy, it should now be possible to com- 
pare differences in the learning of various 
abilities between the experimental and con- 
trol groups to check aspects of the theory, as 
applied in the instance of this experiment. If, 
in addition, it can be shown that the condi- 
tions of learning each ability were typical of 
a population of similar sets of conditions, then 
one is in a position to generalize the results 
of this experiment to the level of hypotheses 
about the factors affecting the learning abil- 
ities in general. While such generalizations 
would not have been “proved” in this experiment,
their agreement with principles stated
would tend to increase the validity of the 
principles. It may be of interest to note, inci- 
dentally, that many of these principles were 
used in making the predictions incorporated 
in the specific hypotheses, and it would be 
most unlikely that an induction of principle 
from observed differences would not agree 
closely with principles that accurately pre- 
dicted the differences. The induction from 
observed differences is more elegant, in that 
it starts from quantitatively described obser- 
vation, and it is probable that the predictions 
themselves were based upon indirect application
of recollected inductions from analogous
cases.
1. Learning of abilities: amount of experience.—Examination
of the reliability of the
differences in gain on general accuracy with 
each of the abilities studied provides oppor- 
tunity for gross estimation of the effect of 
amount of experience in learning of abilities. 
While other factors to be discussed below 
modify the picture considerably, the general 
outline at least is clear. 

The students had the greatest amount of 
experience (as judged by time so spent) in 
planning experiments; next came interpreta- 
tion of data. For these two abilities, the reli- 
abilities range from levels below 1 per cent 
to 1.5 per cent. Prediction of chemical occur- 
rences was a component of other activities, 
and the reliability of the difference in gains 
is quite different in the comparisons of the 
experimental group with the two control 
groups; other factors have had an important 
effect here. The subscores for “insufficient 
data” and for “decrease of wrong predictions” 
bear out the generalization that amount of 
experience is important, since the experimen- 
tal group had some experience with qualify- 
ing prediction during the study of interpre- 
tation of data, and the decrease in wrong 
predictions is a reflection of this experience. 
The remaining two abilities were not taught 
and contained few experimental elements 
that were taught; the reliabilities of differ- 
ences are low, and tend to favor the control 
groups as much as the experimental. 

2. Learning of abilities: verbalization and 
rationalization.—The factor here to be con- 
sidered is perhaps the most important ele- 
ment in the learning of the abilities studied. 
Its importance, incidentally, has been 
assumed throughout the discussion in the 
statements as to whether a given ability was 
“taught”. To be “taught”, there must be both 
opportunity for experience and verbalization 
of that experience. The ability to interpret 
data is an outstanding example of an ability 
that was verbalized and rationalized by the 
students. The ability was defined as 17 spe- 
cific behaviors. The students learned to recog- 
nize the types of situations in which each 
behavior applied, and they learned criteria 
or maxims or conventions by which these 
specific behaviors could be guided. Learned 
in this way, the skills became independent of 
any particular content, as is well shown by 
the fact that the content used in learning the 
ability was all chemical subject matter, 
whereas the evaluation instrument dealt exclusively
with non-chemical subject matter.
This is striking evidence of skill learning
divorced from content determiners. The evidence
that the ability was rationalized will be
presented in connection with affect as a factor
in learning; in brief, the rationalization was
sufficiently great to offset the effects of an
attitude whose effects pervade most of the
other learnings, even though this attitude may
have been developed primarily in connection
with interpretation of data!

3. Learning of abilities: attitude.—In the
original report the writer has attempted to
explain various features of the patterns of
scores in terms of the development of an attitude
of overcaution by the experimental
group. The problem is complicated somewhat
by the fact that the macro group developed
a considerable portion of the opposite attitude,
that of going beyond the data. Since
these two effects operate in the same situations,
it is difficult to distinguish either as a
cause of differences, except when it can be
measured separately. For this reason, the following
discussion is confined to the comparisons
between the semimicro and experimental
groups.

Scores for “overcaution” and for “beyond 
data” were found with regard to three dif- 
ferent abilities. In each case, “overcaution” 
was measured as the number of rejections of 
the opportunity to come to a decision when 
the evidence was adequate for a decision to 
be made. “Beyond data” was the opposite 
characteristic, the number of decisions re- 
corded for which the data were insufficient. 

The initial scores for the three groups, cal- 
culated as the percent of opportunities for 
displaying the trait, are presented in Table V. 
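
The two scores defined above can be made concrete with a small scoring routine. This is a minimal sketch under stated assumptions: the `Response` record and the `trait_scores` function are hypothetical names introduced for illustration, and the report does not describe how its tests were actually tallied.

```python
from dataclasses import dataclass


@dataclass
class Response:
    evidence_adequate: bool  # was the evidence sufficient for a decision?
    decided: bool            # did the student record a decision?


def _per_cent(count: int, opportunities: int) -> float:
    """Per cent of opportunities, or 0.0 when there were none."""
    return 100.0 * count / opportunities if opportunities else 0.0


def trait_scores(responses):
    """Return (overcaution, beyond_data) as per cent of opportunities.

    Overcaution: no decision although the evidence was adequate.
    Beyond data: a decision although the data were insufficient.
    """
    adequate = [r for r in responses if r.evidence_adequate]
    insufficient = [r for r in responses if not r.evidence_adequate]
    overcautious = sum(1 for r in adequate if not r.decided)
    beyond = sum(1 for r in insufficient if r.decided)
    return (_per_cent(overcautious, len(adequate)),
            _per_cent(beyond, len(insufficient)))


# Hypothetical answer sheet: four decidable items, two undecidable ones.
sheet = [Response(True, True), Response(True, True), Response(True, True),
         Response(True, False), Response(False, True), Response(False, False)]
print(trait_scores(sheet))  # (25.0, 50.0)
```

Dividing by the number of opportunities, rather than by the number of items, is what allows the two traits to be compared on tests that offer unequal chances to display them.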

The groups are seen to be much alike, but
very different from test to test. Assuming 
that chance plays a large part, one might 
attempt to account for the differences between 
tests as a function of different numbers of 
ways available for displaying the traits on 
each item. Calculation shows that the ex- 
pected ratios of “beyond data” to “overcau- 
tion,” on the basis of chance alone, are as 
presented in Table VI. 
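
The report does not give the calculation behind these expected ratios. One plausible reading, sketched below purely as an assumption, is that each item offers a fixed number of ways of marking it, some of which would count as “overcaution” and some as “beyond data,” and that random marking chooses among them with equal probability. The function name and the item layout are hypothetical.

```python
def chance_ratio(items):
    """Expected ratio of beyond-data to overcaution under random marking.

    Each item is a tuple (n_ways, n_overcaution, n_beyond): the number of
    ways of marking the item, and how many of those ways would display
    each trait. An item counts as an opportunity for a trait only if at
    least one way of marking it displays that trait.
    """
    over = [(n, o) for n, o, _ in items if o > 0]
    beyond = [(n, b) for n, _, b in items if b > 0]
    if not over or not beyond:
        raise ValueError("need at least one opportunity for each trait")
    # Expected fraction of opportunities on which the trait is displayed.
    expected_over = sum(o / n for n, o in over) / len(over)
    expected_beyond = sum(b / n for n, b in beyond) / len(beyond)
    return expected_beyond / expected_over


# Hypothetical test: one decidable item with 1 overcautious marking out of 4,
# one undecidable item with 2 beyond-data markings out of 4.
print(chance_ratio([(4, 1, 0), (4, 0, 2)]))  # 2.0
```

Under this reading, a test whose undecidable items offer many ways of going beyond the data would show a high ratio even if students marked at random, which is the effect the comparison with observed ratios is meant to separate out.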

It is thus demonstrated that taking account
of differences in numbers of opportunities to
reveal the traits (by using per cent of possible
opportunity), and taking account of the
number of ways of marking each item, still
leave a large discrepancy between relative
amounts of “overcaution” and “beyond data,”
as measured by the three tests. The necessity
for reporting the name of the test along with
the description of a student’s status is thus
apparent.12

TABLE V
INITIAL SCORES FOR “OVERCAUTION” AND “BEYOND DATA”

[Table layout not recoverable from the source. Column headings: Overcaution (Experimental, Semimicro, Macro); Beyond Data (Experimental, Semimicro, Macro). Surviving row label: Planning of Experiments. Surviving entries in extraction order: 12.80, 38.02, 32.49, 15.89, 15.48, 34.68, 7.68, 44.44, 47.46, 52.54, 26.22, 27.26, 48.81.]

TABLE VI
EXPECTED RATIOS OF “BEYOND DATA” TO “OVERCAUTION,” ON THE BASIS OF CHANCE

Ratio of beyond data to overcaution (row labels for the three tests are lost in the source):

Expected by      Observed
chance alone     Experimental   Semimicro   Macro
1.20             3.5            4.1         3.1
1.03             0.81           0.72        0.68
3.00             4.7            3.1         5.7

12 Comparisons of the initial scores for “overcaution” and
“beyond data” on these three sections reveal a much overlooked
point: that attempts to describe the status of a student
with regard to these two traits are meaningless except
as a function of a particular test.

It is found that the experimental group
failed to increase in overcaution on interpretation
of data, as compared with the semimicro
group, that it increased very reliably
in overcaution on planning of experiments,
and that it indicated some increase in overcaution
in prediction. This is particularly revealing,
because the importance of proper
qualification of data was developed mostly in
connection with experiences relevant to interpretation
of data. In terms of time spent, the
students used probably between 5 and 10
times as much in situations conducive to
learning to plan experiments as they did in
situations directly relevant to interpretation
of data. Little time was spent in gaining experiences
relevant to prediction. The above
description may be summarized as follows:


a. It is believed that the students became
aware of overcaution mainly with regard
to interpretation of data, and that this
awareness resulted in the development
of an overcautious attitude in general.


b. The experiences relative to interpretation
of data were sharply focused and
were both rationalized and verbalized.
This resulted in an increase in general
accuracy without the development of
overcaution on this test.


c. There were more experiences relative to
planning experiments than to interpreting
data. “Planning experiments” is a
paper-and-pencil test whose situations
are very similar to the learning situations.
Accuracy increased as a result of
much experience incompletely rationalized.
The attitude of overcaution resulted
in increased errors of this sort,
even though accuracy in general increased.


d. There were few experiences relevant to
prediction. The major shift revealed was 
an increase in overcaution, which is be- 
lieved to be evidence that a general 
attitude was developed. 


The conclusion is that some experience, 
completely rationalized, may be as effective 
as much experience incompletely rationalized 
in the development of general skill abilities. 
Furthermore, a general attitude can be built, 
but its operation may be inhibited in those 
situations that have been completely ration- 
alized. Even in studied situations, if ration- 
alization is not complete, the attitude may 
exert control over the behavior, and in new 
situations also the attitude may be much more 
effective than chance in determining the 
nature of the behavior elicited. 
The general conclusion of detailed study of
the learning of the five abilities13 is that there
is experimental evidence for the importance
in learning of the factors:

1. Frequency of relevant experience.

2. Intensity and multi-perception of experience.

3. Function of learned behavior.

4. Verbalization and rationalization of
learned skills.

5. Development and effect of attitudes.


This constitutes a check on the validity of 
similar methodological principles developed 
in the theory of science teaching. 


III. SUGGESTIONS FOR TEACHING CHEMISTRY 


As a result of this experiment, some pro- 
posals for the improvement of instruction in 
chemistry for students similar to those studied 
may be offered. The following suggestions 
receive direct support from the experiment: 


1. Insofar as possible, plan activities re- 
quiring group planning and execution with 
responsibility divided among individuals. 

2. Develop experiments in class discussion. 
The saving of time in the laboratory would 
be considerable, if twenty minutes of the quiz 
hour could be devoted to developing the 
rationale and technique of each experiment. 

3. Follow up each experiment in class dis- 
cussion, formulating as a class the best pos- 
sible statements of conclusions, and identify- 
ing the assumptions and sources of experi- 
mental error in the experiment. 

4. Widen the range of activities in the 
quiz-laboratory hour to provide experiences 
conducive to the development of a broader 
range of abilities. 

5. Use the development of chemistry as 
the basis for developing understanding of 
scientific method, and provide real opportu- 
nity for the student to use scientific method 
in planning and following up experiments. 

6. Eliminate or re-design the test-tube 
survey type of experiment, except when it 
can be shown that the range of superficial 
experiences with a wide variety of chemicals 
results in valid and important generalizations. 

7. Attack the basic quantitative problems
through supervised drill and class analysis
of principles and methods. Time may be
found for this by changing the quiz-reviews
to short diagnostic tests and discussions of
the test problems.

13 Presented in Chapters VI-X of the complete report.

8. Teach a verbalized rationale along with
all skills. The rationale should include criteria
for deciding what behaviors are appropriate
to solve the problem in accord with desirable
values.


The following suggestions are reasonable
extensions of the principles demonstrated in
this experiment:

9. Theory needs to be re-considered as an
objective for the course. The learning of the
theory is incommensurate with the time and
effort spent in trying to teach it. It might be
better either to omit all theory not needed
by the student or to redesign activities so
that the role of theory is more explicit. More
emphasis upon explanation and prediction
would tend to have this latter effect.

10. Consider the possibility of replacing 
the lectures with demonstration-discussions 
or of eliminating them completely. This con- 
sideration should embrace: 


a. The amount of the lecture time devoted 
to outlining theory that is not learned. 


b. The amount of the lecture time devoted 
to performing experiments that either 
are or could be performed with more 
profit by the students. 

c. The amount of the lecture time devoted 
to attempts to build, through verbal in- 
struction, abilities whose development 
depends upon participation in specially 
planned activities. 

d. The amount of lecture time devoted to 
outlining the material in the book. 
(Perhaps having the student outline the 
book during supervised study periods 
would be more profitable to him.) 


11. Plan each experiment (except those 
whose function is exploratory) to involve 
prediction of results and comparison of re- 
sults with prediction. 

12. Teach fewer and larger generalizations 
through providing opportunities for experi- 
ence with physical and biological material as 
well as chemical. 

13. Give diagnostic pre-tests and plan 
sufficiently “rich” activities that individual 
students can be guided into having experi- 
ences most profitable to them. 

14. Use the book only in connection with 
definite assignments to be fulfilled through 
and at the time of reading the book: Outlin- 
ing, answering guide questions, using data in 
the book to support arguments, and writing 
out definitions of words may be suggested to 
guide the reading of the book. A reading and 
problem clinic might well be established to 
provide assistance for the students needing 
it. Since much of the material presented in 
the text might well be worked out under 
supervision by the student, it might be well 
to eliminate the book and replace it by a 
handbook or a shelf of reference books. 

The following suggestions conform with the 
theory of science teaching, but go completely 
beyond present practice: 

15. Introduce into the course much more 
information about how plants operate, their 
relationships with special interest groups and 
classes in the population, and their societal 
privileges and responsibilities. Also help the 
student to understand what the job of being 
a chemist is like. 

16. Plan experiments to reveal the func- 
tions of chemistry in industry (prevention of 
corrosion, synthesis of compounds for specific 
purposes, development of substitutes, action 
of fertilizer, analysis of food products and 
other consumer goods, and the like). These 
topics should then be placed in their societal 
context through outside reading, lecturing, 
community survey, analysis of advertising, 
or other means. 

17. Teach the dignity and intelligence of 
man by showing that scientific principles are 
made rather than “discovered” by man. 

18. Plan a long-range program of evalua- 
tion-guided research to develop the most 
effective possible teaching procedures. 


SUMMARY 


In the experiment one general type of out- 
come, the ability to think scientifically, has 
been studied through comparisons of two 
instructional methods. The control method 
was believed to be fairly representative of 
the most usual instruction in freshman col- 
lege chemistry. The experimental method 
was a modification of the control methods 
and was guided insofar as possible by a rea- 
sonably comprehensive stated theory of sci- 
ence education. The experimental method as 
here applied was delimited by the need for 
control in the comparisons, by the present 
stated objectives of the control method, and 
by limits in present understanding of teach- 
ing techniques. It is believed to have repre- 
sented a desirable step, but to have nowhere 
approached the stated theoretical possibilities 
in science education. 

The evidence in the experiment was ob- 
tained from a four-hour test battery compris- 
ing six sections. Each section presumably 
elicited specific behaviors believed to be per- 
tinent to a general ability. A rationale for 
each ability and a description of the evalu- 
ation instrument used in its appraisal have 
been presented. The description of achieve- 
ment of each ability was obtained from a 
pattern of scores; there were 31 scores 
obtained from the test battery. 


The major findings of the experiment were 
based upon the agreement between 31 specific 
hypotheses and observed scores obtained by 
students under the two methods of instruc- 
tion. Since the students were tested at the 
middle of the year, as well as at the beginning 
and end, it has been possible in some cases 
to describe what appears to be the course of 
learning of certain behaviors. Interpretations 
as to the validity of certain facets of the 
theory resulted from comparisons by the 
method of analysis of covariance of achieve- 
ment in the various abilities under the two 
methods. It is believed that the theory has 
been checked in certain important aspects 
dealing with the role of experience and the 
value of rationalization of behavior. 
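
The comparison by analysis of covariance mentioned above can be illustrated with a minimal sketch. It assumes a single covariate (a pre-test score) and a common within-group slope; the function name, the group labels, and the scores are hypothetical and are not taken from the report, whose actual computations may have differed.

```python
from statistics import mean


def ancova_one_covariate(groups):
    """One-covariate analysis of covariance (illustrative sketch).

    groups maps a label to a list of (pre_score, post_score) pairs.
    Returns the pooled within-group slope, the adjusted post-test means,
    the F ratio for the group effect, and its degrees of freedom.
    """
    all_x = [x for pairs in groups.values() for x, _ in pairs]
    all_y = [y for pairs in groups.values() for _, y in pairs]
    n_total, k = len(all_x), len(groups)
    grand_x, grand_y = mean(all_x), mean(all_y)

    # Within-group sums of squares and cross-products, pooled over groups.
    sxx_w = sxy_w = syy_w = 0.0
    group_means = {}
    for label, pairs in groups.items():
        mx = mean([x for x, _ in pairs])
        my = mean([y for _, y in pairs])
        group_means[label] = (mx, my)
        sxx_w += sum((x - mx) ** 2 for x, _ in pairs)
        sxy_w += sum((x - mx) * (y - my) for x, y in pairs)
        syy_w += sum((y - my) ** 2 for _, y in pairs)

    # Common slope of post-test on pre-test, estimated within groups.
    slope = sxy_w / sxx_w
    # Residual sum of squares: separate group intercepts, common slope.
    sse_full = syy_w - sxy_w ** 2 / sxx_w

    # Residual sum of squares when group membership is ignored.
    sxx_t = sum((x - grand_x) ** 2 for x in all_x)
    sxy_t = sum((x - grand_x) * (y - grand_y) for x, y in zip(all_x, all_y))
    syy_t = sum((y - grand_y) ** 2 for y in all_y)
    sse_reduced = syy_t - sxy_t ** 2 / sxx_t

    df_between, df_error = k - 1, n_total - k - 1
    f_ratio = ((sse_reduced - sse_full) / df_between) / (sse_full / df_error)

    # Post-test means adjusted to the grand mean of the pre-test.
    adjusted = {
        label: my - slope * (mx - grand_x)
        for label, (mx, my) in group_means.items()
    }
    return slope, adjusted, f_ratio, (df_between, df_error)


# Hypothetical (pre-test, post-test) scores, for illustration only.
scores = {
    "control": [(1, 2), (2, 4), (3, 3)],
    "experimental": [(1, 5), (2, 7), (3, 6)],
}
slope, adjusted, f_ratio, dfs = ancova_one_covariate(scores)
print(slope, adjusted, f_ratio, dfs)
```

The F ratio compares the residual variation left when the groups are ignored with that left when each group is allowed its own intercept, so a large value indicates a difference between methods that initial standing does not account for.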

Further interpretations of the educational 
significance of the differences in outcomes 
under the two methods have been discussed, 
the general nature of some desirable further 
research has been considered (in the original 
report), and a series of suggestions has been 
offered for improvement of instruction in 
freshman chemistry within the existing ad- 
ministrative framework. 





